SCHIZOPHRENIA ASSOCIATED GENES



The present invention relates to the identification of 
genes which have been disrupted in patients diagnosed as 
suffering from schizophrenia and/or bi-polar affective 
disorder, as well as proteins encoded by the gene and 
antibodies thereto and to uses of such products as 
medicaments for treating schizophrenia and/or affective 
psychosis. The invention also relates to methods for 
diagnosing patients suffering or predisposed to 
schizophrenia and/or affective psychosis, as well as 
screens for developing novel treatment regimes for 
schizophrenia and/or affective psychosis. 

Schizophrenia and Bipolar Affective Disorder are 
common and debilitating psychiatric disorders. Despite a 
wealth of information on the epidemiology, neuroanatomy and 
pharmacology of the illness, it is uncertain what molecular 
pathways are involved and how impairments in these affect 
brain development and neuronal function. Despite an 
estimated heritability of 60-80%, very little is known 
about the number or identity of genes involved in these 
psychoses. Although there has been recent progress in 
linkage and association studies, especially from genome- 
wide scans, these studies have yet to progress from the 
identification of susceptibility loci or candidate genes to 
the full characterisation of disease-causing genes 
(Berrettini, 2000).

The cloning of breakpoints in patients with chromosome 
abnormalities (translocations, inversions etc) has proved 
instrumental in the identification of many disease genes 
(e.g. Duchenne Muscular Dystrophy, Retinoblastoma, Wilms'
Tumour, Familial Polyposis Coli, Fragile-X Syndrome, 
Polycystic Kidney Disease, many leukaemias and, very 
recently, a candidate speech and language disorder gene 
(Lai et al, 2001)). Such studies assume that the 
chromosomal breakpoints give rise to the clinical symptoms 
by either directly disrupting gene sequences or perturbing 
gene expression. In the same way that gene-trap
mutagenesis can be used to identify disrupted mouse genes
(Brennan & Skarnes, 1999), the physical "flag" created by 
a cytogenetic breakpoint provides a geographical pointer 
for the disease locus. 

It is amongst the objects of the present invention to 
provide genes and/or proteins postulated to be involved 
with the development and/or symptoms associated with 
schizophrenia and/or affective psychosis. 

As will be seen, the present invention is based on 
the molecular characterisation of a chromosomal disruption 
in subjects diagnosed as suffering from a schizophrenia 
and/or affective psychosis. A high-throughput Fluorescence 
in situ Hybridisation (FISH) -based approach has been 
adopted to map the chromosomal breakpoints in these 
patients. Consultation of the sequence data at the
breakpoint locus not only allows efficient FISH probe 
selections to be made by the targeting of coding regions, 
but also proof of gene disruption can be made entirely by 
relating the exact position of probes to the genomic 
structure of a candidate gene. 

Four patients have been studied and their chromosomal 
disruptions characterised. Hereinafter the patients will 
be identified as patients 1-4. 

As will be seen, in one embodiment, the present 
invention is based on the molecular characterisation of a 
chromosomal rearrangement denoted t(3;8)(p13;p22) in a
subject (patient 1) diagnosed as suffering from a
schizoaffective disorder (see Fig. 1). A high-throughput
Fluorescence in situ Hybridisation (FISH)-based approach
was adopted to map the chromosomal breakpoints in this
patient. Consultation of the sequence data at the
breakpoint loci not only allowed efficient FISH probe 
selections to be made by the targeting of coding regions, 
but also proof of gene disruption was inferred entirely by 
relating the exact position of probes to the genomic 
structure of a candidate gene.

One breakpoint (located on chromosome 8p22) in this
subject lies near to a gene, N33, involved in the N-Linked 
Glycosylation pathway. 

This pathway consists of three stages: firstly, the
assembly of a donor oligosaccharide at the endoplasmic
reticulum lumen membrane; secondly, the transfer of this
molecule onto newly translated secretory and transmembrane
proteins, catalyzed by the oligosaccharyltransferase (OST)
complex; and thirdly, the subsequent modification of the
oligosaccharides on the glycoprotein. N33 encodes a
protein thought to be involved in the second stage of the 
pathway by analogy with yeast homologues. Without wishing
to be bound by theory, it is hypothesised that the
breakpoint in the subject perturbs N33 expression
indirectly through position effect silencing or separation
of regulatory elements from the gene promoter (both effects
have been shown to occur even when the breakpoints are up
to 1Mb from the target gene in some instances (Kleinjan et
al, 1999)).

As the N33 gene is located within a chromosomal
region repeatedly found positive in schizophrenia linkage
studies, the present inventors pursued this gene further by
association study.

Certain microsatellite repeat haplotypes have been 
identified at the N33 locus which are over-represented in
schizophrenic patients and their families compared to the 
normal population. Subsequent sequencing of the N33 gene 
in haplotype carrying individuals is ongoing in order to 
identify causative mutations. 

The other breakpoint in this patient (3p13) has now
also been fully characterised and demonstrated to disrupt
a gene, SEMCAP3 (also known as KIAA1095). The present
invention is therefore also based on a proposed role of 
this gene (normal and mutated forms) in the aetiology of 
schizophrenia and/or affective psychosis.

In a further embodiment the present invention is
based on the GRIK4 gene and observations of the present
inventors of an involvement of this gene and/or protein
with schizophrenia and/or affective psychosis.

The GRIK4 gene is also known as KA1 and EAA1, but will
herein be referred to as GRIK4 for simplicity; this should
not be construed as limiting.

The subject (patient 2) was one of a series of around 
100 patients with comorbid schizophrenia and mild learning 
disability (US terminology: "mental retardation") who were 
screened using routine G-band karyotyping. This patient 
possesses a complex chromosomal rearrangement which can be
described by standard nomenclature as:
46,XX,ins(8;11)(q13;q23.3q24.2)inv(2)(p12q32.1)t(2;11)(q21.3;q24.2)
der(2)(2qter->2q32.1::2p12->2q21.3::11q24.2->11qter)
der(11)(11pter->11q23.3::2q21.3->2q32.1::2p12->2pter)
der(8)(8pter->8q13::11q23.2->11q24.2::8q13->8qter).
It has been repeatedly observed that schizophrenia occurs
more frequently in individuals with mild learning 
disability than in the general population and recent work 
has revealed an increased heritability of this comorbid 
state.

As described herein the FISH results reveal that the 
subject has a disruption in a brain-expressed gene, namely
GRIK4 which is known to participate in molecular mechanisms 
responsible for modulating the strength of synaptic 
transmission. 

In a further embodiment the present invention is based 
on the characterisation of a balanced reciprocal 
translocation between chromosomes 9 and 14,
t(9;14)(q34;q13), in a mother (patient 3) with schizophrenia
and her daughter with schizophrenia co-morbid with mild
learning disability. A brain transcription factor gene,
NPAS3, is shown to be disrupted by the translocation at
14q13. Without wishing to be bound by theory, the present
inventors' hypothesis is that the disruption of this gene is
responsible for the psychotic symptoms exhibited by the
mother and daughter.

As will be seen, the present invention is also based 
on the molecular characterisation of a chromosomal 
rearrangement denoted t(1;16)(p31.2;q21) (in patient 4).

The proband met ICD-10 and DSM-IV criteria for 
definite schizophrenia. The translocation was inherited 
within other branches of the family with variable clinical 
expression. However, some key translocation carriers among the
subjects to whom the inventors had access had not passed
the age of risk when clinically characterized.

One breakpoint (located on chromosome 1p31.2) in
patient 4 lies within an alternatively spliced form of the 
gene, PDE4B, involved in the attenuation of cAMP secondary 
messenger signaling. 

The remaining breakpoint in this patient (16q21) has
now also been fully characterised and demonstrated to disrupt a
gene, CADHERIN 8 (CDH8). The present invention is therefore
based in part on a proposed role of this gene in the 
aetiology of schizophrenia and/or affective psychosis. 

In a first aspect the present invention provides use 
of a polynucleotide fragment or fragments comprising 
SEMCAP3, N33, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s) or a
fragment(s), derivative(s) or homologue(s) thereof for the
manufacture of a medicament for treating schizophrenia 
and/or affective psychosis in a subject. 

In another aspect the present invention provides use 
of a polypeptide fragment or fragments encoded by SEMCAP3,
N33, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s), or a
fragment(s), derivative(s) or homologue(s) thereof for the
manufacture of a medicament for treating schizophrenia 
and/or affective psychosis in a subject. 

Schizophrenia and/or affective psychosis as used 
herein relates to schizophrenia, as well as other affective 
psychoses such as those listed in "The ICD-10 
Classification of Mental and Behavioural Disorders" World 
Health Organization, Geneva 1992. Categories F20 to F29
inclusive include schizophrenia, schizotypal and
delusional disorders. Categories F30 to F39 inclusive are
mood (affective) disorders that include bipolar affective
disorder and depressive disorder. Mental retardation is
coded F70 to F79 inclusive. Also included are all
conditions coded 295.xx (Schizophrenia and Other Psychotic
Disorders) and 296.xx (Major Depressive Disorders and
Bipolar Disorders) in The Diagnostic and Statistical
Manual of Mental Disorders, Fourth Edition (DSM-IV),
American Psychiatric Association, Washington DC, 1994.
Mental retardation is coded 315, 317, 318 and 319.

SEMCAP3 has been previously cloned and sequenced in 
mouse as two alternative forms (Semcap3A and 3B) and the 
sequences are present in the public database (nucleic acid 
sequences; AF127084/AF127085, respectively; protein
sequences AAF22131/AAF22132, respectively) as directly
submitted by Wang & Strittmatter, 1999. The human form of
the gene is defined by sequence KIAA1095 (nucleic acid 
sequence, AB029018 or XM_041363, and a smaller form, 
BC014432; protein sequence, XP_041363). The genomic 
sequences corresponding to this gene are also present in 
the public database (eg. for BAC RP11-252O10, AC024102).
Nevertheless, the prior art does not suggest any link 
between SEMCAP3 and schizophrenia and/or affective 
psychosis. 

Thus, references herein to the SEMCAP3 gene are 
understood to relate to the sequences in the public 
databases and identified in Fig. 3 and references to the 
SEMCAP3 protein sequence are understood to relate to the
sequences in the public databases and identified in Fig. 4.

N33 has been previously cloned and sequenced and the 
sequence is present in the public database (nucleic acid
sequence; U42349, protein sequence; Q13454) and described
in MacGrogan et al, 1996. The genomic sequences
corresponding to this gene are also present in the public
database (eg. for BAC RP11-23J14) but some SNP
polymorphisms or sequencing errors (eg. an extra "C"
present in exon 1b, see hereinafter - cctgcccCaccggg -) may
result in differences to the sequences presented herein.
Nevertheless, the prior art does not suggest any link 
between N33 and schizophrenia and affective psychosis. 

In addition to the sequences previously identified, 
the present inventors have identified a new start exon (1a,
see Figures 6 and 7) and have observed the complexity of
the exon splicing at the 3' end of the gene (see Figures 6
and 7).

Thus, references herein to the N33 gene are understood 
to relate to the sequences in the public databases and 
identified in Figures 6 and 7 and references to the N33 
protein sequence are understood to relate to the sequences
in the public databases and identified in Figures 6 and 7. 

The GRIK4 gene is located on chromosome 11, at
cytogenetic position 11q22.3. The gene encodes a kainate
receptor subunit and has been previously described by 
Kamboj et al, 1994. The cDNA nucleotide sequence and
peptide sequence were disclosed by Kamboj et al, 1994 and
submitted to the Genbank/EMBL database under accession 
NM_014619. The coding sequence of the gene is identified 
as being 2871 nucleotides in length, coding for a protein
of 956 amino acids. The nucleotide and protein sequences are
shown in Figures 10 and 11 respectively. The present 
inventors have identified an alternative start site for the 
gene (see Figures 15-17) which would result in a shorter 
gene/protein of 933 amino acids as opposed to 956. The 
full nucleotide sequence and protein sequence of this 
alternatively encoded gene/protein is shown in Figures 16 
and 17. 
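By way of illustration only, the relationship between the stated coding sequence length and the stated protein length can be checked with a short calculation. The sketch below assumes that the quoted coding sequence length includes the terminal stop codon; the function name is hypothetical and is not taken from the present specification.

```python
def protein_length_from_cds(cds_nucleotides, includes_stop=True):
    """Return the number of amino acids encoded by a coding sequence.

    Assumption: the coding sequence is an uninterrupted reading frame,
    so its length must be a multiple of three.
    """
    if cds_nucleotides % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_nucleotides // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


# 2871 nucleotides -> 957 codons -> 956 amino acids plus one stop codon.
print(protein_length_from_cds(2871))
```

On the same assumption, a protein of 933 amino acids would correspond to a coding sequence of 934 codons, that is 2802 nucleotides.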

Thus, references herein to the GRIK4 gene are
understood to relate to the sequences identified in Figures
10 and 16 and references to the GRIK4 protein sequence are
understood to relate to the sequences identified in Figures
11 and 17.

The human form of NPAS3 has previously been identified 
and is found in the public database under accession numbers 
AB054575 and AF164438, with the differences being due to
alternative splicing; all forms are encompassed within
the present invention. 

Thus, references herein to the NPAS3 gene are
understood to relate to the sequences identified in Figures
18 and 20 and references to the NPAS3 protein sequence are
understood to relate to the sequences identified in Figures
19 and 21.

The PDE4B gene is located on chromosome 1 at 
cytogenetic position 1p31.2. The gene encodes a
phosphodiesterase which shows homology to the Dunce learning
and memory gene product of Drosophila melanogaster (Bolger
et al, 1993). Two long (PDE4B1 and PDE4B3) and one short
(PDE4B2) splice form are described herein. There is a core
protein sequence of 525 amino acid residues shared by all
three forms. On to this is added 39 N-terminal amino acid
residues in the case of PDE4B2. Both of the long forms
share an additional central stretch of 118 amino acid
residues, but then diverge at the N-terminal end of the
proteins; PDE4B1 has 93 specific residues and PDE4B3, 78.
It is predicted that only the PDE4B1 splice form (brain 
expressed) may be disrupted by the chromosomal abnormality 
observed in the patient and family. 
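For illustration, the segment lengths given above can be summed to obtain an overall length for each splice form. The sketch below assumes that the segments are contiguous and non-overlapping, so that the lengths are simply additive; the resulting totals are derived figures and are not stated in the text, and the names used are hypothetical.

```python
CORE_RESIDUES = 525            # shared by all three splice forms
SHORT_FORM_N_TERMINAL = 39     # added to the core in PDE4B2
LONG_FORM_CENTRAL = 118        # shared by both long forms
LONG_FORM_SPECIFIC = {"PDE4B1": 93, "PDE4B3": 78}  # form-specific N-termini


def isoform_lengths():
    """Return the summed residue count for each PDE4B splice form."""
    lengths = {"PDE4B2": CORE_RESIDUES + SHORT_FORM_N_TERMINAL}
    for name, specific in LONG_FORM_SPECIFIC.items():
        lengths[name] = CORE_RESIDUES + LONG_FORM_CENTRAL + specific
    return lengths


for name, length in sorted(isoform_lengths().items()):
    print(name, length)
```

Under the stated assumption this gives 736 residues for PDE4B1, 564 for PDE4B2 and 721 for PDE4B3.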

Thus, references herein to the PDE4B gene are
understood to relate to the sequences identified in Figures 
25, 27 and 29 and references to the PDE4B protein sequence 
are understood to relate to the sequences identified in 
Figures 26, 28 and 30. 

CADHERIN 8 (CDH8) has been previously cloned and
sequenced and the sequence is present in the public
database (nucleic acid sequence; L34060/AB035305/NM_001796,
protein sequence; NP_001787) and described in Suzuki et
al., 1991, Tanihara et al., 1994, and Shimoyama et al.,
2000. An alternative transcript form has been described in
the rat in which there is a truncation within the 5th
cadherin domain (Kido et al., 1998 and see Fig. 4). The
accession numbers for the normal and truncated forms of 
CDH8 in rat are AB010436 and AB010437, respectively. The
corresponding human truncated transcript is not present in
the public database and so is not yet confirmed. The
genomic sequences corresponding to CDH8 are also present in
the public database (eg. BAC CTC-420A11; AC040161).
Nevertheless, the prior art does not suggest any link
between CDH8 and schizophrenia and/or affective psychosis.

Thus, references herein to the CDH8 gene are
understood to relate to the nucleic sequences in the public
databases and identified in Fig. 35 and references to the
CDH8 protein sequences are understood to relate to the
sequences in the public databases and identified in Fig. 36.

In certain jurisdictions claims to methods of
treatment are permissible and so the skilled reader will
appreciate that the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s), or fragment(s), derivative(s) or
homologue(s) thereof; or SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 protein, or functionally active fragment(s),
derivative(s) , or homologue(s) thereof, may be administered 
to an individual as a method of treating an individual with 
schizophrenia and/or affective psychosis. 

"Polynucleotide fragment" as used herein refers to a 
chain of nucleotides such as deoxyribose nucleic acid (DNA) 
and transcription products thereof, such as RNA.
Naturally, the skilled addressee will appreciate the whole
naturally occurring human genome is not included in the
definition of polynucleotide fragment. 

The polynucleotide fragment can be isolated in the 
sense that it is substantially free of biological material 
with which the whole genome is normally associated in vivo. 
The isolated polynucleotide fragment may be cloned to 
provide a recombinant molecule comprising the 
polynucleotide fragment. Thus, "polynucleotide fragment"
includes double and single stranded DNA, RNA and 
polynucleotide sequences derived therefrom, for example, 
subsequences of said fragment and which are of any 
desirable length. Where a nucleic acid is single stranded
then both a given strand and a sequence or reverse
complementary thereto is within the scope of the present
invention.
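As a minimal sketch of what is meant by a reverse complementary sequence, the following derives the reverse complement of a single stranded DNA sequence. The function name is hypothetical, and only the four unambiguous DNA bases are handled, which is a simplifying assumption.

```python
# Watson-Crick pairing for DNA; case is preserved so that a base
# highlighted in upper case stays highlighted in the result.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original strand, which is a convenient sanity check.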

In general, the term "expression product" or "gene 
product" refers to both transcription and translation 
products of said polynucleotide fragments. When the 
expression or gene product is a "polypeptide" (i.e. a chain 
or sequence of amino acids displaying a biological activity 
substantially similar (eg. 98%, 95%, 90%, 80%, 75% 
activity) to the biological activity of the protein) , it 
does not refer to a specific length of the product as such. 
Thus, the skilled addressee will appreciate that 
"polypeptide" encompasses inter alia peptides, polypeptides 
and proteins. The polypeptide if required, can be modified 
in vivo and in vitro, for example by glycosylation, 
amidation, carboxylation, phosphorylation and/or post-
translational cleavage.

The present invention further provides a recombinant 
or synthetic polypeptide for the manufacture of reagents 
for use as therapeutic agents in the treatment of 
schizophrenia and/or affective psychosis. In particular, 
the invention provides pharmaceutical compositions 
comprising the recombinant or synthetic polypeptide 
together with a pharmaceutically acceptable carrier 
therefor.

The present invention further provides an isolated 
polynucleotide fragment capable of specifically hybridising 
to a related polynucleotide sequence from another species. 
In this manner, the present invention provides probes 
and/or primers for use in ex vivo and/or in situ detection 
and expression studies. Typical detection studies include 
polymerase chain reaction (PCR) studies, hybridisation
studies, or sequencing studies. In principle any specific 
polynucleotide sequence fragment from the identified 
sequences may be used in detection and/or expression 
studies. The skilled addressee understands that a specific 
fragment is a fragment of the sequence which is of 
sufficient length, generally greater than 10, 12, 14, 16 or
20 nucleotides in length, to bind specifically to the
sequence, under conditions of high stringency, as defined 
herein, and not bind to unrelated sequences, that is 
sequences from elsewhere in the genome of the organism 
other than an allelic form of the sequence or non- 
homologous sequences from other organisms. 

"Capable of specifically hybridising" is taken to mean 
that said polynucleotide fragment preferably hybridises to 
a related or similar polynucleotide sequence in preference 
to unrelated or dissimilar polynucleotide sequences. 

The invention includes polynucleotide sequence(s)
which are capable of specifically hybridising to a
polynucleotide fragment as described herein or to a part
thereof without necessarily being completely complementary
or reverse complementary to said related polynucleotide 
sequence or fragment thereof. For example, there may be at 
least 50%, or at least 75%, at least 90%, or at least 95% 
complementarity. Of course, in some cases the sequences 
may be exactly reverse complementary (100% reverse 
complementary) or nearly so (e.g. there may be less than 
10, typically less than 5 mismatches). Thus, the present
invention also provides anti-sense or complementary
nucleotide sequence(s) which is/are capable of specifically
hybridising to the disclosed polynucleotide sequence. If
a specific polynucleotide is to be used as a primer in PCR
and/or sequencing studies, the polynucleotide must be
capable of hybridising to related nucleic acid and capable
of initiating chain extension from the 3' end of the
polynucleotide, but not able to correctly initiate chain 
extension from unrelated sequences. 
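The degree of reverse complementarity referred to above can be expressed as a simple percentage. The following is an illustrative sketch only: it assumes two ungapped sequences of equal length, both written 5' to 3', and the function name is hypothetical. It does not model hybridisation behaviour, which also depends on the conditions used.

```python
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_reverse_complementarity(probe, target):
    """Percentage of probe positions that base-pair with the target.

    Both sequences are given 5'->3'; the target is read in reverse so
    that the two strands are compared in antiparallel orientation.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    reversed_target = target.upper()[::-1]
    paired = sum(
        1 for p, t in zip(probe.upper(), reversed_target) if _PAIRS.get(p) == t
    )
    return 100.0 * paired / len(probe)


print(percent_reverse_complementarity("ATGC", "GCAT"))  # fully complementary
print(percent_reverse_complementarity("ATGC", "GCAA"))  # one mismatch in four
```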

If a polynucleotide sequence of the present invention 
is to be used in hybridisation studies to obtain or 
identify a related sequence from another organism the 
polynucleotide sequence should preferably remain hybridised 
to a sample polynucleotide under stringent conditions. If
desired, either the test or sample polynucleotide may be
immobilised. Generally the test polynucleotide sequence is
at least 10, 14, 20 or at least 50 bases in length. It may
be labelled by suitable techniques known in the art. 
Preferably the test polynucleotide sequence is at least 200 
bases in length and may even be several kilobases in 
length. Thus, either a denatured sample or test sequence 
can be first bound to a support. Hybridization can be
effected at a temperature of between 50 and 70°C in double
strength SSC (2 x SSC: NaCl at 17.5g/l and sodium citrate (SC) at
8.8g/l) buffered saline containing 0.1% sodium dodecyl
sulphate (SDS). This can be followed by rinsing of the
support at the same temperature but with a buffer having a 
reduced SSC concentration. Depending upon the degree of 
stringency required, and thus the degree of similarity of 
the sequences, such reduced concentration buffers are 
typically single strength SSC containing 0.1% SDS, half
strength SSC containing 0.1% SDS and one tenth strength SSC
containing 0.1% SDS. Sequences having the greatest degree
of similarity are those the hybridisation of which is least 
affected by washing in buffers of reduced concentration. 
It is most preferred that the sample and inventive 
sequences are so similar that the hybridisation between 
them is substantially unaffected by washing or incubation 
in standard sodium citrate (0.1 x SSC) buffer containing
0.1% SDS.
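By way of example, the composition of the reduced strength wash buffers can be derived from the double strength figures given above. The sketch below takes the stated values for double strength SSC (NaCl at 17.5 g/l and sodium citrate at 8.8 g/l) and assumes that lower strengths are obtained by simple proportional dilution; the names used are hypothetical.

```python
# Grams per litre at double strength (2 x SSC), as stated in the text.
DOUBLE_STRENGTH_SSC = {"NaCl": 17.5, "sodium citrate": 8.8}


def ssc_composition(strength):
    """Return g/l of each salt for a given SSC strength (e.g. 0.1 for 0.1 x)."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    factor = strength / 2.0  # scale relative to the 2 x SSC reference
    return {salt: round(g_per_l * factor, 3)
            for salt, g_per_l in DOUBLE_STRENGTH_SSC.items()}


for strength in (2.0, 1.0, 0.5, 0.1):
    print(f"{strength} x SSC:", ssc_composition(strength))
```

The SDS concentration is independent of the SSC strength and remains at 0.1% in each of the washes described.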

Oligonucleotides may be designed to specifically 
hybridise to N33, SEMCAP3, NPAS3, GRIK4, PDE4B and/or CDH8
nucleic acid. They may be synthesised by known techniques
and used as primers in PCR or sequencing reactions or as
probes in hybridisations designed to detect the presence of
a mutated or normal N33, SEMCAP3, NPAS3, GRIK4, PDE4B
and/or CDH8 gene(s) in a sample. The oligonucleotides may
be labelled by suitable labels known in the art, such as
radioactive labels, chemiluminescent labels or fluorescent
labels and the like. 

The term "oligonucleotide" is not meant to indicate 
any particular length of sequence and encompasses 
nucleotides of preferably at least 10b (e.g. 10b to 1kb) in
length, more preferably 12b-500b in length and most
preferably 15b to 100b.

The oligonucleotides may be designed with respect to 
any of the sequences described herein and may be 
manufactured according to known techniques. They may have 
substantial sequence identity (e.g. at least 50%, at least 
75%, at least 90% or at least 95% sequence identity) with 
one of the strands shown therein or an RNA equivalent, or 
with a part of such a strand. Preferably such a part is at 
least 10, at least 30, at least 50 or at least 200 bases 
long. It may be an open reading frame (ORF) or a part 
thereof.

Oligonucleotides which are generally greater than 30 
bases in length should preferably remain hybridised to a 
sample polynucleotide under one or more of the stringent 
conditions mentioned above. Oligonucleotides which are
generally less than 30 bases in length should also 
preferably remain hybridised to a sample polynucleotide but 
under different conditions of high stringency. Typically 
the melting temperature of an oligonucleotide less than 30 
bases may be calculated according to the formula of: 2°C
for every A or T, plus 4°C for every G or C, minus 5°C.
Hybridization may take place at or around the calculated
melting temperature for any particular oligonucleotide, in
6 x SSC and 1% SDS. Non-specifically hybridised
oligonucleotides may then be removed by stringent washing,
for example in 3 x SSC and 0.1% SDS at the same
temperature. Only substantially similar matched sequences 
remain hybridised i.e. said oligonucleotide and 
corresponding test nucleic acid. 
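The melting temperature formula given above can be sketched as follows. This is an illustrative implementation of the formula as stated in the text (2°C per A or T, plus 4°C per G or C, minus 5°C); the function name is hypothetical, and the restriction to unambiguous DNA bases is an assumption.

```python
def melting_temperature(oligo):
    """Approximate melting temperature in degrees C for a short oligo.

    Applies only to oligonucleotides of fewer than 30 bases, as in the text.
    """
    oligo = oligo.upper()
    if not oligo or len(oligo) >= 30:
        raise ValueError("formula applies to oligonucleotides of 1 to 29 bases")
    if set(oligo) - set("ACGT"):
        raise ValueError("only A, C, G and T are supported")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc - 5


# A 20-mer with ten A/T and ten G/C bases: 20 + 40 - 5 = 55.
print(melting_temperature("ATGCATGCATGCATGCATGC"))
```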

When oligonucleotides of generally less than 30 bases 
in length are used in sequencing and/or PCR studies, the
melting temperature may be calculated in the same manner as
described above. The oligonucleotide may then be allowed
to anneal or hybridise at a temperature around the
oligonucleotide's calculated melting temperature. In the
case of PCR studies the annealing temperature should be
around the lower of the calculated melting temperatures for
the two priming oligonucleotides. It is to be appreciated 
that the conditions and melting temperature calculations 
are provided by way of example only and are not intended to 
be limiting. It is possible through the experience of the 
experimenter to vary the conditions of hybridisation and
thus anneal/hybridise oligonucleotides at temperatures 
above their calculated melting temperature. Indeed this 
can be desirable in preventing so-called non-specific 
hybridisation from occurring. 
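As a hedged sketch of the primer pair guidance above, a starting annealing temperature can be taken as the lower of the two calculated melting temperatures. The helper below restates the melting temperature formula of the text so that the example is self-contained; the function names and the example primers are hypothetical and are not sequences of the invention.

```python
def melting_temperature(oligo):
    """2 C per A/T plus 4 C per G/C, minus 5 C, for oligos under 30 bases."""
    oligo = oligo.upper()
    if not oligo or len(oligo) >= 30 or set(oligo) - set("ACGT"):
        raise ValueError("expected 1 to 29 bases drawn from A, C, G and T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc - 5


def annealing_temperature(forward_primer, reverse_primer):
    """Starting annealing temperature: the lower of the two melting temperatures."""
    return min(melting_temperature(forward_primer),
               melting_temperature(reverse_primer))


# Hypothetical primers with melting temperatures of 55 and 67 degrees C.
print(annealing_temperature("ATGCATGCATGCATGCATGC", "GGGGCCCCGGGGCCCCGG"))
```

As the text notes, this value is a starting point only, and the experimenter may raise the temperature to reduce non-specific hybridisation.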

It is possible when conducting PCR studies to predict 
an expected size or sizes of PCR product(s) obtainable
using an appropriate combination of two or more 
oligonucleotides, based on where they would hybridise to 
the sequences described herein. If, on conducting such a 
PCR on a sample of DNA, a fragment of the predicted size is 
obtained, then this is predictive that the DNA encodes a 
homologous sequence from a test organism. 
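The prediction of an expected PCR product size can be sketched as a simple search of a known sequence for the two primer binding sites. The example below assumes exact, ungapped matches and a single binding site for each primer; the template and primers are invented for illustration and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def predicted_product_size(template, forward_primer, reverse_primer):
    """Return the expected product length in bases, or None if no product.

    The forward primer is sought on the given strand; the reverse primer
    binds the opposite strand, so its reverse complement is sought
    downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site = template.find(reverse_site, start + len(forward_primer))
    if site == -1:
        return None
    return site + len(reverse_site) - start


# Invented template: 4 flanking bases, an 8-base forward site, a 10-base
# spacer, an 8-base reverse site and 4 flanking bases.
template = "AAAA" + "ATGCCGTA" + "TTTTTTTTTT" + "GGCATCAG" + "CCCC"
print(predicted_product_size(template, "ATGCCGTA", "CTGATGCC"))
```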

Proteins for all the applications described herein can 
be produced by cloning the gene for example into plasmid 
vectors that allow high expression in a system of choice 
e.g. insect cell culture, yeast, animal cells, bacteria 
such as Escherichia coli. To enable effective purification 
of the protein, a vector may be used that incorporates an 
epitope tag (or other "sticky" extension such as His6) onto 
the protein on synthesis. A number of such vectors and 
purification systems are commercially available. 

The polynucleotide fragment can be molecularly cloned 
into a prokaryotic or eukaryotic expression vector using 
standard techniques and administered to a host. The 
expression vector is taken up by cells and the 
polynucleotide fragment of interest expressed, producing 
protein.

It will be understood that for the particular 
polypeptides embraced herein, natural variations such as 
may occur due to polymorphisms, can exist between 
individuals or between members of the family. These
variations may be demonstrated by (an) amino acid
difference(s) in the overall sequence or by deletions,
substitutions, insertions, inversions or additions of (an) 
amino acid(s) in said sequence. All such derivatives 
showing the recognised activity are included within the 
scope of the invention. For example, for the purpose of 
the present invention conservative replacements may be made 
between amino acids within the following groups: 

(I) Alanine, serine, threonine; 

(II) Glutamic acid and aspartic acid; 

(III) Arginine and lysine;

(IV) Asparagine and glutamine; 

(V) Isoleucine, leucine and valine; 

(VI) Phenylalanine, tyrosine and tryptophan.

Moreover, recombinant DNA technology may be used to
prepare nucleic acid sequences encoding the various
derivatives outlined above. 
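The conservative replacement groups listed above lend themselves to a simple lookup. In the sketch below the groups are written in single letter amino acid code; group (III) is taken here to be the basic pair arginine and lysine, which is an assumption made for the purposes of the example, and the function name is hypothetical.

```python
# Single-letter codes for groups (I) to (VI).
CONSERVATIVE_GROUPS = [
    {"A", "S", "T"},   # (I) alanine, serine, threonine
    {"E", "D"},        # (II) glutamic acid, aspartic acid
    {"R", "K"},        # (III) arginine, lysine (assumed basic pair)
    {"N", "Q"},        # (IV) asparagine, glutamine
    {"I", "L", "V"},   # (V) isoleucine, leucine, valine
    {"F", "Y", "W"},   # (VI) phenylalanine, tyrosine, tryptophan
]


def is_conservative(original, replacement):
    """True if the two residues differ but fall within the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a replacement
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # same group
print(is_conservative("A", "W"))  # different groups
```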

As is well known in the art, the degeneracy of the 
genetic code permits substitution of bases in a codon 
resulting in a different codon which is still capable of 
coding for the same amino acid, e.g. the codon for amino 
acid glutamic acid is both GAG and GAA. Consequently, it
is clear that for the expression of polypeptides from 
nucleotide sequences described herein or fragments thereof, 
use can be made of a derivative nucleic acid sequence with 
such an alternative codon composition different from the 
nucleic acid sequences shown in the Figures. 
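The degeneracy of the genetic code can be illustrated with the standard codon table. The sketch below builds the standard table from its conventional compact representation (bases in the order T, C, A, G) and lists the codons that encode a given amino acid; the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return all codons encoding the given single-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(synonymous_codons("E"))   # codons for glutamic acid
print(translate("GAAGAG"))      # two different codons, same amino acid
```

Two nucleic acid sequences that differ only by such synonymous codons therefore translate to the same polypeptide.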

The polynucleotide fragments of the present invention 
are preferably linked to regulatory control sequences. 
Such control sequences may comprise promoters, operators, 
inducers, enhancers, silencers, ribosome binding sites, 
terminators etc. Suitable control sequences for a given 
host may be selected by those of ordinary skill in the art. 

A polynucleotide fragment according to the present 
invention can be ligated to various expression controlling 
sequences, resulting in a so called recombinant nucleic 
acid molecule. Thus, the present invention also includes
an expression vector containing an expressible nucleic acid
molecule. The recombinant nucleic acid molecule can then 
be used for the transformation of a suitable host. 

Specific vectors which can be used to clone nucleic 
acid sequences according to the invention are known in the 
art (e.g. Rodriguez, R.L. and Denhardt, D.T., Edit.,
Vectors: a survey of molecular cloning vectors and their
uses, Butterworths, 1988, or Jones et al., Vectors: Cloning
Applications: Essential Techniques (Essential techniques
series), John Wiley & Sons, 1998).

The methods to be used for the construction of a 
recombinant nucleic acid molecule according to the 
invention are known to those of ordinary skill in the art 
and are inter alia set forth in Sambrook, et al. (Molecular 
Cloning: a laboratory manual, Cold Spring Harbor
Laboratory, 1989).

The present invention also relates to a transformed 
cell containing the polynucleotide fragment in an 
expressible form. "Transformation", as used herein, refers 
to the introduction of a heterologous polynucleotide 
fragment into a host cell. The method used may be any 
known in the art, for example, direct uptake, transfection,
transduction or electroporation (Current Protocols in
Molecular Biology, 1995. John Wiley and Sons Inc.). The
heterologous polynucleotide fragment may be maintained 
through autonomous replication or alternatively, may be 
integrated into the host genome. The recombinant nucleic 
acid molecules preferably are provided with appropriate 
control sequences compatible with the designated host which 
can regulate the expression of the inserted polynucleotide 
fragment, e.g. tetracycline responsive promoter, thymidine 
kinase promoter, SV-40 promoter and the like. 

Suitable hosts for the expression of recombinant 
nucleic acid molecules may be prokaryotic or eukaryotic in 
origin. Hosts suitable for the expression of recombinant 
nucleic acid molecules may be selected from bacteria, 
yeast, insect cells and mammalian cells.

In another aspect the present invention also relates
to a method of diagnosing schizophrenia and/or affective
psychosis or susceptibility to schizophrenia and/or
affective psychosis in an individual, wherein the method
comprises determining if SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) in the individual has been disrupted by
a mutation or chromosomal rearrangement.

The methods which may be employed to elucidate such a
mutation or chromosomal rearrangement are well known to
those of skill in the art. The disruption could be detected,
for example, using PCR or in hybridisation studies using
suitable probes designed to span an identified mutation site
or chromosomal breakpoint in close proximity to the/said
N33, SEMCAP3, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s), such
as the breakpoint identified by the present inventors and
described herein.

Once a particular polymorphism or mutation has been 
identified it may be possible to determine a particular 
course of treatment. For example it is known that some 
forms of treatment work for some patients, but not all. 
This may in fact be due to mutations in the/said N33, 
SEMCAP3, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s) or
surrounding sequence, and it may therefore be possible to 
determine a treatment strategy using current therapies, 
based on a patient's genotype. 

It will be appreciated that mutations in the gene
sequence or controlling elements of a gene, e.g. a promoter
and/or enhancer, can have subtle effects such as affecting
mRNA splicing/stability/activity and/or control of gene
expression levels, which can also be determined. Also the
relative levels of RNA can be determined using for example
hybridisation or quantitative PCR as a means to determine
if the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8
gene(s) has been disrupted.

Moreover the presence and/or levels of the/said
SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene(s)
products themselves can be assayed by immunological
techniques such as radioimmunoassay, Western blotting and
ELISA using specific antibodies raised against the gene
products. The present invention also therefore relates to
antibodies specific for a SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) product(s) and uses thereof in
diagnosis and/or therapy.

A further aspect of the present invention therefore 
provides antibodies specific to the polypeptides of the 
present invention or epitopes thereof. Production and 
purification of antibodies specific to an antigen is a 
matter of ordinary skill, and the methods to be used are 
clear to those skilled in the art. The term antibodies can
include, but is not limited to, polyclonal antibodies,
monoclonal antibodies (mAbs), humanised or chimeric
antibodies, single chain antibodies, Fab fragments, F(ab')2
fragments, fragments produced by a Fab expression library,
anti-idiotypic (anti-Id) antibodies, and epitope binding
fragments of any of the above. Such antibodies may be used
in modulating the expression or activity of the particular
polypeptide, or in detecting said polypeptide in vivo or in
vitro.

Using the sequences disclosed herein, it is possible 
to identify related sequences in other animals, such as 
mammals, with the intention of providing an animal model 
for psychiatric disorders associated with the improper 
functioning of the nucleotide sequences and proteins of the 
present invention. Once identified, the homologous
sequences can be manipulated in several ways common to the
skilled person in order to alter the functionality of the 
nucleotide sequences and proteins homologous to those of 
the present invention. For example, "knock-out" animals 
may be created, that is, the expression of the genes 
comprising the nucleotide sequences homologous to those of 
the present invention may be reduced or substantially 
eliminated in order to determine the effects of reducing or 
substantially eliminating the expression of such genes. 
Alternatively, animals may be created where the expression
of the nucleotide sequences and proteins homologous to
those of the present invention is upregulated, that is,
the expression of the genes comprising the nucleotide
sequences homologous to those of the present invention may
be increased in order to determine the effects of
increasing the expression of these genes. In addition to
these manipulations, substitutions, deletions and additions
may be made to the nucleotide sequences encoding the
proteins homologous to those of the present invention in
order to effect changes in the activity of the proteins to
help elucidate the function of domains, amino acids, etc.
in the proteins. Furthermore, the sequences of the present
invention may also be used to transform animals in the
manner described above. The manipulations described above
may also be used to create an animal model of schizophrenia 
and/or affective psychosis associated with the improper 
functioning of the nucleotide sequences and/or proteins of 
the present invention in order to evaluate potential agents 
which may be effective for combatting psychotic disorders, 
such as schizophrenia and/or affective psychosis. 

Thus, the present invention also provides for screens 
for identifying agents suitable for preventing and/or 
treating schizophrenia and/or affective psychosis 
associated with disruption or alteration in the expression 
of the SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene
and/or its gene products. Such screens may easily be 
adapted to be used for the high throughput screening of 
libraries of compounds such as synthetic, natural or 
combinatorial compound libraries. 

Thus, the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products according to the present
invention can be used for the in vivo or in vitro
identification of novel ligands or analogs thereof. For
this purpose binding studies can be performed with cells
transformed with nucleotide fragments according to the
invention or an expression vector comprising a
polynucleotide fragment according to the invention, said
cells expressing the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products according to the invention.

Alternatively, the/said SEMCAP3, N33, GRIK4,
NPAS3, PDE4B and/or CDH8 gene(s) products according to the
invention, as well as ligand-binding domains thereof, can be
used in an assay for the identification of functional
ligands or analogs for the/said SEMCAP3, N33, GRIK4, NPAS3,
PDE4B and/or CDH8 gene(s) products.

Methods to determine binding to expressed gene 
products as well as in vitro and in vivo assays to 
determine biological activity of gene products are well 
known. In general, the expressed gene product is contacted
with the compound to be tested and binding, stimulation or 
inhibition of a functional response is measured. 

Thus, the present invention provides for a method for
identifying ligands for SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products, said method comprising the
steps of:

a) introducing into a suitable host cell a
polynucleotide fragment according to the invention;

b) culturing cells under conditions to allow
expression of the polynucleotide fragment;

c) optionally isolating the expression product;

d) bringing the expression product (or the host cell
from step b)) into contact with potential ligands which
will possibly bind to the protein encoded by said
polynucleotide fragment from step a);

e) establishing whether a ligand has bound to the
expressed protein; and

f) optionally isolating and identifying the ligand.

As a preferred way of detecting the binding of the
ligand to the expressed protein, signal transduction
capacity may also be measured.

Compounds which activate or inhibit the function of
SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene(s)
products may be employed in therapeutic treatments to
activate or inhibit the polypeptides of the present
invention.

The present invention will now be further described by 
way of Example and with reference to the Figures which 
show: 

Figure 1 shows an ideogram diagram of the chromosomal
rearrangement (a reciprocal translocation) in patient 1.
The two breakpoints are marked at the approximate
chromosomal locations at which they are located. In
addition, and not to scale, the two candidate disease-
causing genes, N33 and SEMCAP3, are placed in the correct
orientation and with respect to the breakpoints.

Figure 2 shows a representation of the genomic
structure of the SEMCAP3 gene: its spliced exons spread
over a genomic extent of approximately 250kb.
Above the gene, the coding contribution of each exon to the
SEMCAP3 protein is indicated by bars and finely dashed
lines. The domain structure of SEMCAP3 protein is shown at
the top of the figure. 'RING' refers to a RING-finger
domain, 'ZF-TRAF' to a TRAF-type zinc finger (also referred
to as a sina domain) and 'PDZ' to the PDZ domain present in
PSD-95, Dlg, and ZO-1/2. The BAC clones used to identify
the breakpoint location are included at the bottom of the
figure together with the inferred direction (arrows) of the
breakpoint from the FISH results using these clones. The
heavy dashed line shows the position of the breakpoint with
respect to the gene exons and the domain structure of the
protein.

Figure 3 Nucleic acid sequence of Human SEMCAP3
(genomic DNA sequence including CpG island/putative
promoter upstream of 5' UTR/cDNA sequence is also included
for clarity). The following features are marked for
clarity:

a) ATG start site located at position 709 (underlined)

b) GG bases (underlined) at the junction between exons 3
and 4 (i.e. between which the breakpoint is located)

c) UAA stop codon located at position 3907 (underlined).

Figure 4 Amino acid sequence of Human SEMCAP3 with
underlined regions of interest.

a) Residues 18-55 Ring finger domain

b) Residues 101-158 SINA/ZF-TRAF domain

c) Residues 246-339 PDZ domain #1

d) Residues 418-504 PDZ domain #2

Figure 5 shows a schematic representation of the N33
gene: exon splicing and chromosome breakpoint identified
in the present invention.

Figure 6 shows the nucleotide sequence of the various
exons for N33.

Figure 7 shows the various transcript options and 
associated amino acid sequences of the transcripts for N33. 

Figure 8 shows N33 protein aligned with other
homologues.

Figure 9 shows the effect of the C-terminus of the
various N33 splice forms. The variety of splice forms at
the 3' end of the gene has implications for the C-terminus
of the protein. This is especially important when it is
considered that N33 is likely to reside in the Golgi/ER
compartment of the cell where C-termini are often involved
in anchoring or trafficking proteins to different
organelles. The light grey shading indicates putative
transmembrane domains. Hence, only the spliceforms with
exons 1a/1b,2-6,7,8,9,10,11 or 1a/1b,2-6,7,8,9,11 are likely
to encode functional proteins and these will only differ in
the extreme C-terminal residues.

Figure 10 shows the published nucleotide sequence for
GRIK4.

Figure 11 shows the published amino acid sequence for
GRIK4.

Figure 12 Breakpoints identified in the subject
(patient 2). CEPH library YACs (Chumakov et al, 1992)
spanning the breakpoints are listed. Also detailed are the
BAC clones (and accession numbers) from the RPCI-11 BAC
library (Osoegawa et al, 2001) that span or flank (indicated
by dashes) the breakpoints. Breakpoints at 8q13 were not
characterised in this study.

Figure 13 Representation of complex chromosomal
rearrangement in the subject (patient 2). The pericentric
chromosome 2 inversion is coupled with a translocation to
chromosome 11. The chromosome 11 region between the 11q23.3
and 11q24.3 breakpoints is inserted on chromosome 8q13.

Figure 14 Genomic arrangements of the GRIK4 gene
disrupted in the subject. Two potential GRIK4 transcripts
with alternative start-sites are indicated. The 1a/1a' exons
are derived from EST BE388730. The transcript
containing the 1b exon corresponds to the published GRIK4
sequence (acc. S67803). It is probable that the present
inventors' exon "4" corresponds to a number of undefined
exons which can only be subdivided after release of genomic
sequence over this part of the gene. Hence, the actual
number of GRIK4 transcript exons will most likely exceed
14. BAC (grey boxes), cosmid (white boxes) and long-range
PCR product (black line) derived FISH probes enabled the
positioning of the breakpoint (arrows indicate the relative
direction of the breakpoint deduced from the
presence/absence of the signals on the two derived
chromosomes). Probes from BAC RPCI-11 89P5 and cosmids
LA11197-C5, LA1153-H6, LA11236-G3 and LA1192-G6 indicated
that the breakpoint was located near exons 2 and 3. A FISH
probe synthesized from a long-range PCR product
corresponding to the intronic sequence between these two
exons indicated that the breakpoint lies upstream of the
intron between exons 2 and 3.

Figure 15 5' sequence of the GRIK4 gene showing the
two possible N-terminal peptides derived from alternate
start sites. Exon combination 1a-1a'-2 is derived from an
EST sequence (acc. BE388730). Exon combination 1b-2 is
based on the published cDNA sequence (e.g. acc. S67803).
The actual amino acid sequence may differ from the
published amino acid sequence as there is a potential
downstream methionine start (MVAG... instead of MPRV...)
containing a more conserved Kozak sequence (Kozak, 1986).
It can be seen that the breakpoint upstream of exon 2 will
separate the majority of the coding sequence from the
promoter resulting in a putative null allele. Exonic DNA
sequence is shown in capitals, intronic or upstream
sequence in lower case. Conserved splice junction
sequences (EXON/GT AG/EXON) are underlined. Single
letter amino acid codes are shown beneath the appropriate
DNA codons. A functional C/G: Leu/Val single nucleotide
polymorphism (underlined) is found within exon 2.

Figure 16 shows the complete alternative nucleic acid
sequence as identified by the present inventors.

Figure 17 shows the complete alternative amino acid 
sequence as identified by the present inventors. 

Figure 18 shows the nucleic acid sequence of NPAS3 
spliceform 1. 

Figure 19 shows the protein sequence of NPAS3
spliceform 1.

Figure 20 shows the nucleic acid sequence of NPAS3
spliceform 2.

Figure 21 shows the protein sequence of NPAS3
spliceform 2.

Figure 22 shows an ideogram representation of the 
balanced translocation in patient 3 relating to this 
invention. 

Figure 23 shows the genomic arrangement of the NPAS3
gene including the position of the observed breakpoint.

Figure 24 shows potential functional consequences of
the disruption to NPAS3 gene: dominant-negative activity.

Figure 25 shows the PDE4B1 nucleic acid sequence. 

Figure 26 shows the PDE4B1 protein sequence. 

Figure 27 shows the PDE4B3 nucleic acid sequence. 

Figure 28 shows the PDE4B3 protein sequence. 

Figure 29 shows the PDE4B2 nucleic acid sequence. 

Figure 30 shows the PDE4B2 protein sequence.

Figure 31 a) Ideogram representation of balanced
translocation between chromosomes 1 and 16 in patient 4.

Figure 32 Genomic arrangements of the PDE4B gene
disrupted in the subject (patient 4). The two long
transcripts of the PDE4B gene are shown. FISH showed the
breakpoint was within a gap in the genome sequence between
BACs RPCI-11 433N2 and RPCI-11 442I1. This positioned the
breakpoint between the first and second exons of the PDE4B1
form of the gene (acc. L20966). A long-range PCR product
FISH probe corresponding to the genomic region encompassing
the 1a exons of PDE4B1 confirmed that the gene was
disrupted between exon pairs 1a and exon 2 (i.e. only
PDE4B1 transcripts are directly disrupted by the chromosome
abnormality).

Figure 33 shows an ideogram diagram of the chromosomal
rearrangement (a reciprocal translocation) in patient 4.
The two breakpoints are marked at the approximate
chromosomal locations at which they are located. In
addition, and not to scale, the two candidate disease-
causing genes, PDE4B and CDH8, are placed in the correct
orientation and with respect to the breakpoints. The fusion
genes on derived chromosomes 1 and 16 that result from the
reciprocal translocation are also indicated, demonstrating
the potential capacity for fusion transcript/protein
synthesis.

Figure 34 shows a representation of the genomic
structure of the CDH8 gene: its spliced exons spread over
a genomic extent of approximately 400kb.
Above the gene, the coding contribution of each exon to the
CDH8 protein is indicated by bars and finely dashed lines.
The domain structure of CDH8 protein is shown at the top of
the figure. 'N' and 'C' refer to the N- and C-termini of
the protein. The broken line at the N-terminus indicates
the existence of signal peptide and proprotein domains,
both of which are cleaved off in the mature protein. The
'CD' ovals represent the positions of the five
extracellular cadherin domains. The black box signifies the
position of the hydrophobic stretch of amino acids that act
as the membrane-spanning domain. The BAC clones used to
identify the breakpoint location are included at the bottom
of the figure together with the inferred direction (arrows)
of the breakpoint from the FISH results using these clones.
The heavy dashed line shows the position of the breakpoint
with respect to the gene exons and the domain structure of
the protein.

Figure 35 Nucleic acid sequence of Human CDH8. The
following features are marked for clarity:

a) ATG start site located at position 253 (underlined)

b) GC bases (underlined) at the junction between exons 1
and 2 (i.e. between which the breakpoint is located)

c) UGA stop codon located at position 2650 (underlined).
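
The start and stop positions quoted for Figures 3 and 35 can be
checked for reading-frame consistency with a short Python sketch.
It assumes, as is not stated explicitly in the text, that both
positions are 1-based and mark the first base of the start and stop
codons respectively, and that the stop codon is not counted as
coding sequence. The function name is illustrative only.

```python
from typing import Tuple


def coding_region(atg_pos: int, stop_pos: int) -> Tuple[int, int]:
    """Return (coding nucleotides, encoded residues) for an open reading frame.

    Assumes both positions are 1-based and refer to the first base of the
    start codon and of the stop codon; the stop codon itself is excluded.
    """
    coding_nt = stop_pos - atg_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt, coding_nt // 3


print(coding_region(709, 3907))  # SEMCAP3 (Figure 3): (3198, 1066)
print(coding_region(253, 2650))  # CDH8 (Figure 35):   (2397, 799)
```

Under these assumptions both open reading frames are in frame, and
the implied lengths exceed the highest residue numbers annotated in
Figures 4 and 36.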

Figure 36 Amino acid sequence of Human CDH8 with 
underlined regions of interest. 

a) Residues 1-29 signal peptide domain (italics) 

b) Residues 30-61 propeptide fragment cleaved off in mature
protein

c) Residues 76-158 cadherin domain #1 (underlined)

d) Residues 172-248 cadherin domain #2 (underlined)

e) Residues 281-383 cadherin domain #3 (underlined)

f) Residues 396-487 cadherin domain #4 (underlined)

g) Residues 500-597 cadherin domain #5 (underlined)

h) 'V' highlighted at position 513 is the last residue in
common with the putative truncated rat protein product from
the alternatively spliced form.

i) Residues 622-645 transmembrane domain #1 (underlined).

Figure 37

a) Fusion protein product resulting from CDH8 promoter/exon
1 spliced to PDE4B exon 2 and beyond (transcribed on
der(16)). The underlined residues 'RV' represent the fusion
site between the two genes.

b) Fusion protein product resulting from PDE4B promoter
(long form)/exon 1a spliced to CDH8 exon 2 and beyond
(transcribed on der(1)). See text for details: only the
reading frame producing the N-terminal truncated form of
the CDH8 protein is shown. The underlined 'gc' at position
68 represents the point of fusion between the two genes.
Three potential methionine translation start sites are
shown (highlighted) with the second of these having a
nucleic acid sequence most similar to the canonical Kozak
sequence (underlined). Use of this start site would
generate a truncated CDH8 protein lacking the signal
peptide, proprotein fragment, cadherin domain 1 and most of
cadherin domain 2.

Materials and methods

Lymphocyte extraction and metaphase chromosome preparation 

Lymphocytes were extracted from 7mls of patient blood
(for storage and generation of EBV-transformed cell lines)
using density gradient separation (Histopaque-1077, Sigma).
In order to generate metaphase-arrested chromosomes for
cytogenetic analysis, 0.5mls of patient blood were cultured
for 71hrs in medium containing phytohaemagglutinin
(Peripheral Blood Medium, Sigma). The short-term cultures
were treated with colcemid for one hour followed by a 
conventional fixing procedure. Fixed chromosomes were 
dropped onto microscope slides and stored for 1 week prior 
to use in FISH experiments. 

Selection of YAC clones for FISH probe synthesis 

YAC clones were selected from the Whitehead/MIT map of 
the relevant chromosome in the cytogenetic intervals within 
which the breakpoints were adjudged to lie. YACs were
obtained from the HGMP Resource Centre, Babraham
Bioincubator, Babraham, Cambridge, UK
(http://www.hgmp.mrc.ac.uk/). Clone DNA was prepared by
standard methods and PCR amplified using primers designed
against consensus sequence elements within the archetypal
Alu repeat (Breen et al, 1992). This "Alu-PCR" gives a
representative spread of non-repetitive sequence over the
full length of the YAC and generates a better FISH probe
than native YAC DNA. Alu-PCR was performed using the Expand
Long Template PCR kit (Roche). Cycling conditions: 94°C -
45s, 55°C - 30s, 68°C - 5min: 35 cycles. 68°C - 10min final
extension.
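
As a rough guide to the length of such a run, the hold times of a
cycling program can be summed with a few lines of Python. The sketch
below assumes the Alu-PCR program reads 94°C for 45s, 55°C for 30s
and 68°C for 5min over 35 cycles with a 10min final extension (the
5min step is a reading of the source text and should be treated as
an assumption). Ramp times between temperatures are ignored, and the
function name is illustrative only.

```python
def program_seconds(step_seconds, cycles, final_extension_seconds):
    """Total hold time of a PCR program, ignoring temperature ramping."""
    return cycles * sum(step_seconds) + final_extension_seconds


# Assumed Alu-PCR program: denature 45 s, anneal 30 s, extend 5 min,
# 35 cycles, then a 10 min final extension.
alu_pcr_s = program_seconds([45, 30, 5 * 60], cycles=35,
                            final_extension_seconds=10 * 60)

print(alu_pcr_s)                    # 13725 seconds
print(round(alu_pcr_s / 3600, 1))   # about 3.8 hours
```

The same function can be applied to any other cycling program
described in the same form.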

Fluorescence in situ hybridisation (FISH) protocol 

Probe template DNA (pooled Alu-PCR products, BAC clone
DNA, cosmid clone DNA or long-range PCR products) was
labelled by nick translation and hybridised to patient
metaphase spreads using standard FISH methods. Slides were
counterstained with DAPI in Vectashield anti-fade solution
(Vector Laboratories). A Zeiss Axioskop fluorescence
microscope with a Chroma number 81000 or 830000 multi-
spectral filter set was used to observe the chromosomal
hybridisations. Images were captured using Vysis
SmartCapture extension running within IPLab Spectrum or
Digital Scientific SmartCapture imaging software. FISH
signals observed on derived chromosomes dictated the
selection of further clones required to "walk" towards the
breakpoint. Breakpoint-spanning FISH probes have signals on
a normal chromosome and on both derived chromosomes.
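
The logic used to interpret FISH signals when walking towards a
breakpoint can be summarised in a minimal Python sketch. The
function name and the labels it returns are illustrative and are
not part of the original protocol. The example calls use the
behaviour reported in Example 1 for BAC clones RPCI-11 606p16 and
RPCI-11 94j25.

```python
def classify_probe(signals, retained_der, partner_der):
    """Place a FISH probe relative to a translocation breakpoint.

    signals      -- chromosomes showing a signal, e.g. {"normal", "der(3)"}
    retained_der -- derivative chromosome keeping the probe's home segment
    partner_der  -- derivative chromosome receiving the translocated segment
    """
    seen = set(signals)
    on_retained = retained_der in seen
    on_partner = partner_der in seen
    if on_retained and on_partner:
        # Signal split between both derivatives: the probe crosses the breakpoint.
        return "spans breakpoint"
    if on_retained:
        return "retained side of breakpoint"
    if on_partner:
        return "translocated side of breakpoint"
    return "uninformative"


# Chromosome 3 probes in a t(3;8) carrier, as described in Example 1.
print(classify_probe({"normal", "der(8)"}, "der(3)", "der(8)"))  # 606p16
print(classify_probe({"normal", "der(3)"}, "der(3)", "der(8)"))  # 94j25
```

Two probes that fall on opposite sides flank the breakpoint, and
clones lying between them become the next candidates to test.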

Resolution of breakpoint position 

BAC clones corresponding to positive YAC regions were
arranged into contigs by consulting the Washington
University FPC
(http://www.genome.wustl.edu/gsc/human/Mapping/index.shtml),
UCSC GoldenPath Draft Human Genome Browser
(http://genome.ucsc.edu/goldenPath/hgTracks.html) and
Ensembl (http://www.ensembl.org/) databases. BAC clones
were supplied by BACPAC Resources, Oakland, California, USA
(http://www.chori.org/bacpac/). Clone selection was biased
to gene-containing BACs. Once a breakpoint-spanning BAC was
identified, the position of the breakpoint in relation to
candidate gene exons was determined by FISH probes
generated from chromosome-specific library cosmids (HGMP
Resource Centre) or precisely positioned, repeat element-
free long-range PCR products (Expand long range PCR kit,
Roche; see below for primer sequences). Cycling
conditions: 94°C - 45s, 52°C - 30s, 68°C - 11min: 35 cycles.
68°C - 15min final extension. Cosmids were isolated by
probing the appropriate chromosome-specific library filters
(HGMP-RC) with isotopically labelled exon-specific PCR
products.

Example 1: Molecular characterisation of chromosomal
disruption and identification of disrupted gene from
patient 1

FISH experiments on chromosome 3p13 had narrowed the
location of the breakpoint to a region including the large
gene SEMCAP3 (approximately 250kb genomic extent). Two BAC
clones were selected from the tiling diagram of BAC clones
placed on the human genome map backbone (June 2002 release
of the 'BAC End Pairs' track on the UCSC Genome Browser;
http://genome.cse.ucsc.edu/index.html?org=Human). These
were RPCI-11 606p16 and RPCI-11 94j25. By FISH, these BAC
clones flanked the breakpoint (the former translocated to
the derived chromosome 8 and the latter remained on the
derived chromosome 3). The position of these two BAC clones
indicated that the breakpoint lay within the large (200kb)
intron between exons 3 and 4 of the SEMCAP3 gene (see
Fig. 2). Thus, the inventors inferred from these results
that the SEMCAP3 gene was directly disrupted by the 3p13
translocation event and, as such, is a candidate gene for
the psychiatric disorder exhibited by the patient.

Semcap3 (semaphorin cytoplasmic domain-associated
protein) was originally identified in mouse as a gene
encoding a protein that interacts with M-semF/Sema4c. Two
forms, 3A and 3B, were submitted to the public nucleic acid
sequence database (Wang & Strittmatter, 1999) but have yet
to be published. It appears that 3b may be an artifactual
sequence as it displays deletions in the sequence. Sema3a
is identical in structure to the predicted human gene,
KIAA1095, and the inventors refer to this sequence as human
SEMCAP3. The yeast two-hybrid screen that isolated Sema3a/b
also identified Sema1 and Sema2 as genes encoding proteins
which interact with the cytoplasmic tail of the SEMA4C
protein (Wang et al., 1999).

The purpose of these screening experiments was to
elucidate cytoplasmic interactors with the transmembrane
receptor, SEMA4C. This protein belongs to a large group of
signalling proteins described as 'semaphorins'. In the
brain, these proteins are thought to play important roles
in brain development through their action on axonal
guidance and growth cone stability. Inagaki et al. (1995)
showed that Sema4C is expressed in the developing mouse
brain. One proposed explanation for the origin of
psychiatric disorders (including the disorder exhibited by
the patient described here) is the incorrect development of
the brain, particularly the connections, projections and
neural networks between brain subregions. With this in
mind, semaphorins, and the proteins that interact with them
(such as the SEMCAPs), become attractive candidate genes
for the psychiatric disorders.

It is suspected that the PDZ domains (see Fig. 2) of
the SEMCAP3 protein will be involved in protein-protein
interactions (such as SEMA4C interaction) as they are in
other proteins. The RING-finger domain of SEMCAP3
identifies it as belonging to a class of proteins known as
ubiquitin ligases. Ubiquitin ligases specifically target
proteins for ubiquitination and subsequent destruction in
the proteasome pathway. Thus, SEMCAP3 may act to regulate
the activity of other proteins (for instance, components of
the semaphorin pathway) by targeting them for destruction.
The ZF-TRAF/SINA domain is most likely an extension of the
RING-finger domain.

Figure 2 shows that the breakpoint would end SEMCAP3
transcription after the third exon on the derived
chromosome 3 (there would still be one normal chromosome 3
and SEMCAP3 gene remaining in each nucleus). If
transcription occurs on the derived chromosome 3 then the
resulting translated protein product would be truncated,
lacking part of the first PDZ domain and all subsequent
amino acids in the C-terminal direction. It remains to be
investigated if the psychiatric disorder in this patient
results from N33 perturbation on one allele, the disruption
of SEMCAP3 on one allele, the generation of an aberrantly
functioning truncated SEMCAP3 from one allele or a
combination of these.

Pulver et al. (1995) detailed schizophrenia linkage to
chromosome 3p (albeit telomeric to SEMCAP3). However two
further studies have failed to replicate these findings in
different populations (Maziade et al., 2001 & Hovatta et
al., 1998).



Example 2: Further molecular characterisation of
chromosomal disruption and identification of disrupted gene

In this case, primers corresponding to N33 3' UTR
sequences and an STS, SHGC-12093 (Acc. No. G17275), were
designed (see below for primer sequences). These PCR
products were used to screen the chromosome 8 specific
cosmid library (LA08). Among others, positive cosmids
LA0854-H5 (3' UTR) and LA08145-E3 (STS) were isolated and
subsequently used in FISH experiments (see below for
results).

3' UTR primers

Primer A: TGCCACGTGTTAGCAGAAAG
Primer B: TGCCTTTAACCAGATGAGGC

SHGC-12093 primers

Primer A: TCTTGTGGGTCACAATTAGGC
Primer B: TAAAAAGGTGCAGTTTCTTCAGC
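
Basic properties of the primers listed above can be tabulated with
a short Python sketch. The melting temperature uses the simple
2°C per A/T plus 4°C per G/C rule of thumb, which is only a rough
estimate for short oligonucleotides and is not stated to be the
method used by the inventors. The function names are illustrative.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def rule_of_thumb_tm(primer: str) -> int:
    """Approximate melting temperature: 2 per A/T plus 4 per G/C (degrees C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


primers = {
    "3' UTR primer A": "TGCCACGTGTTAGCAGAAAG",
    "3' UTR primer B": "TGCCTTTAACCAGATGAGGC",
    "SHGC-12093 primer A": "TCTTGTGGGTCACAATTAGGC",
    "SHGC-12093 primer B": "TAAAAAGGTGCAGTTTCTTCAGC",
}

for name, seq in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2), rule_of_thumb_tm(seq))
```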

The subject has schizoaffective disorder and a
balanced reciprocal translocation between chromosomes 3 and
8. An 8p22 breakpoint-crossing YAC, 931_a_1, was identified.
This permitted an 8p22 breakpoint-crossing BAC, RPCI-11 23j14
(acc. no. AC019292), to be found. This was shown to contain
the 3' end of the N33 gene (Fig. 6). Subsequently, FISH with
cosmids LA0854-H5 and LA08145-E3 from the LANL chromosome
8 specific library (HGMP Resource Centre, Babraham,
Cambridge, UK) flanked the breakpoint, placing it
approximately 100kb from exon 11 of N33. N33 is related to
a number of genes: human IAG2, Drosophila CG7830, C.
elegans g304348 and two yeast proteins, OST3 and OST6 (see
Fig. 8 for alignment of proteins). While the homologies
between N33 and the yeast proteins are relatively weak,
they share conserved cysteine residues and have the same
locations for the four transmembrane domains as predicted
by hydropathy plots. Ost3 and Ost6 are components of the
oligosaccharyltransferase complex responsible for the
addition of oligosaccharides to selected proteins. This has
been backed up by protein structure prediction programs
detailed in a recent report (Fetrow et al, 2001).

The present inventors have identified an alternative
start exon, herein identified as exon 1a (see Figures 5 &
6), to that in the public database, herein identified as
exon 1b. Additionally they have identified a complex
variation of splicing with the exons and proposed sequences
of the transcripts, shown in Figures 5, 6 and 7
respectively. In view of the complex splice variations the
C-terminal sequence of the various N33 splice forms is
predicted to vary and this is shown in Figure 9.

Because N33 lies within a linkage hotspot for 
schizophrenia (Gurling et al, 2001, Brzustowicz et al, 
1999, Blouin et al, 1998, Kaufmann et al, 1998, Kendler et 
al, 1996, Pulver et al, 1995) the present inventors decided 
to carry out an association study on this gene. Three 
microsatellite markers (D8S549, the N33 microsatellite and 
D8S1992) were chosen, with the following primers: 

Microsatellites used in association study 

D8S549 
Primer A: AAATGAATCTCTGATTAGCCAAC 
Primer B: TGAGAGCCAACCTATTTCTACC 

N33 microsatellite 
Primer A: AGGCTGAGTGCCAAAAAGTA 
Primer B: CTTTAAGCTTGCTATTTGAAGGC 

D8S1992 
Primer A: TTCATCGTCTGAACCTGG 
Primer B: ACACATTTCCTCTATGTTGC 

These markers were used to type 25 
mother-father-schizophrenic proband trios and 64 
schizophrenic cases and 64 normal controls. The haplotypes 
derived from the trio study were examined for frequency 
bias in the case and control samples. Certain haplotypes 
are over-represented in the schizophrenic case genotypes 
compared to controls. Appropriate individuals with the 
haplotypes are currently being screened for mutations. 
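
By way of illustration only, a comparison of haplotype 
frequencies between case and control chromosomes of the kind 
described above could be carried out with a two-sided 
Fisher's exact test on a 2x2 table. The following is a 
minimal sketch; the function name and the counts are 
hypothetical and are not data from the study, and the 
statistical method actually used is not stated here. 

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: chromosomes carrying the
    haplotype / chromosomes not carrying it.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x haplotype carriers in row 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts (not from the study): 64 cases and 64 controls
# contribute 128 chromosomes per group.
p_value = fisher_exact_two_sided(20, 108, 8, 120)
print(f"two-sided p = {p_value:.4f}")
```

With three markers and many possible haplotypes, any such 
test would in practice need correction for multiple 
comparisons. 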



Example 3: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 2 

Psychiatric evaluation 

The subject (female) was approached and gave full, 
informed written consent for this study as one of a large 
cohort of people co-morbid for schizophrenia and mental 
retardation. Prior to investigation she was not known to 
have any abnormality of karyotype. She suffered from 
chronic schizophrenia and a mild degree of mental 
retardation (IQ between 65 and 70). The diagnosis of chronic 
schizophrenia was confirmed using SADS-L structured 
interview to generate DSM-IV and ICD-10 criteria, by a 
psychiatrist experienced in both general psychiatry and the 
psychiatry of mental retardation (WM). SADS can be 
reliably used in patients with mild mental retardation. 
Consensus diagnosis was reached on review by two 
psychiatrists (WM and DB). IQ scores were generated from 
WAIS-R and their stability shown by similar levels detected 
by psychological examination at different times throughout 
her life. There were no dysmorphic features in the 
subject. However, the subject had suffered from bilateral 
deafness since childhood - a consequence of surgical 
operations on the mastoids. There was no family history of 
mental illness or mental retardation that could be 
ascertained. Other members of the family declined to 
participate in the study. 

An initial G-banded karyotype of this patient 
indicated that the chromosome abnormality was complex 
(46,XX,ins(8;11)(q13;q23.3q24.2)inv(2)(p12q32.1) 
t(2;11)(q21.3;q24.2) 
der(2)(2qter->2q32.1::2p12->2q21.3::11q24.2->11qter) 
der(11)(11pter->11q23.3::2q21.3->2q32.1::2p12->2pter) 
der(8)(8pter->8q13::11q23.3->11q24.2::8q13->8qter)), 
involving a pericentric inversion of chromosome 2 coupled 
with rearrangements involving chromosomes 2, 8 and 11 
(Fig. 13). Figure 12 details the YAC and BAC FISH probes 
crossing or bracketing breakpoints on 2 and 11. Sequence in 
the locality of the breakpoints was assessed for gene 
content. 

PCR primers 

Long-range PCR for FISH probe templates: 
Int2-3 GRIK4a; CAGGAGGTCCTGTGAAGCTC, 
Int2-3 GRIK4b; ACAGGGAAAGAAGCAAAGCA. 

GRIK4 exon region-specific PCR: screening of chromosome 11 
cosmid libraries: 
Ex1a/a' a; AAAGCTAAGCGCAGGTGTGT, 
Ex1a/a' b; TTTCTGGGAGGCAACCATAG, 
Ex1b a; GCAGAGTTATGTCATGCCCA, 
Ex1b b; CGTGTGCAGGACTCTGATGT, 
Ex2/3 a; TTGAACCCAAGAGAACAGGG, 
Ex2/3 b; TCCCCTTCTCCTTCCAGTTT. 

Cycling conditions: 94°C - 2 min initial denaturation. 94°C - 
1 min, 52°C - 1 min, 72°C - 75 s: 33 cycles. 72°C - 15 min final 
extension. 
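
As a rough check on primers of the kind listed above, GC 
content and an approximate melting temperature can be 
computed from the sequence alone. The sketch below uses the 
Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is 
only a crude estimate for short oligonucleotides; it is an 
illustrative assumption and not a method stated in this 
specification. The function names are hypothetical. 

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (deg C) by the Wallace rule:
    2 deg C for each A or T plus 4 deg C for each G or C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Long-range PCR primers as listed in the text
primers = {
    "Int2-3 GRIK4a": "CAGGAGGTCCTGTGAAGCTC",
    "Int2-3 GRIK4b": "ACAGGGAAAGAAGCAAAGCA",
}

for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm ~{wallace_tm(seq)} deg C (Wallace rule)")
```

Annealing temperatures are commonly set a few degrees below 
the lower of the two primer estimates, although the value 
used in practice is usually determined empirically. 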

The 11q23.3 breakpoint is located at a locus 
containing a kainate-type ionotropic glutamate receptor 
(GRIK4, acc. S67803 & NM_014619 (11); previous nomenclature 
KA1/EAA1). Cosmid FISH directed at the individual exons 
and an intron-specific long-range PCR product FISH (Fig. 15) 
positioned the breakpoint within the GRIK4 gene sequence, 
most likely immediately upstream of exon 2 (our 
nomenclature, Fig. 15). This was confirmed using a long- 
range PCR product FISH probe corresponding to the intron 
between exons 2 and 3 (Fig. 15). We also identified a 
GenBank EST (acc. BE3S8730, IMAGE clone ID:3613199) 
generating an alternative start-site resulting in an 
alternative cognate N-terminal peptide sequence (Figures 16 
and 17). The position of a breakpoint anywhere between 
exons 1a/a'/1b and exon 3 would truncate all putative 
transcript forms such that no receptor function could be 
encoded on the derived chromosome 11. Hence, the patient 
had only one intact GRIK4 allele. 

Discussion 

The present inventors identified a subject with 
comorbid schizophrenia and mild learning disability in 
whom chromosome translocation events have disrupted brain- 
expressed genes that are also functional disease candidates. 
Without wishing to be bound by theory it is hypothesised 
that the disruption of the GRIK4 gene by a chromosomal 
breakpoint (and the resulting reduced gene dosage) is the 
principal underlying cause of psychiatric disease in this 
patient. 

The gene disrupted in this patient is both expressed 
in the brain and participates in key physiological 
processes in the CNS. Notably, the gene may be involved in 
the alteration of the strength of synaptic/neural 
transmission, a phenomenon known as long-term potentiation 
(LTP). LTP is postulated to underlie cognitive functions 
such as learning and memory. Moreover, cognitive testing 
has previously established that these functions are 
frequently affected in patients with schizophrenia. 

GRIK4 

Three classes of ionotropic glutamate receptors have 
been identified on the basis of their pharmacological 
profiles and sequence homologies: NMDA receptors, AMPA 
receptors and kainate receptors. Functional kainate 
receptors in vivo may be heteromeric, consisting of 
combinations of the subunits with low affinity for kainate 
agonists (GLUR5, GLUR6 and GLUR7) and the high-affinity 
subunits (GRIK4 and GRIK5) (Chittajallu et al, 1999; Lerma 
et al, 2001 and Werner et al, 1991). The subject with 
comorbid schizophrenia and mild learning disability 
possesses a complex chromosomal rearrangement. Of all the 
breakpoints studied in this patient only the GRIK4 gene is 
directly disrupted. This might be expected to modify kainate 
receptor channel properties by altering subunit 
stoichiometry. 

The glutamate receptors are key initiators of synaptic 
LTP (Miller and Mayford, 1999). NMDA receptors are the 
principal mediators of LTP but recently presynaptic kainate 
receptor-dependent plasticity changes have been observed at 
mossy fibre synapses in the hippocampus (Contractor et al, 
2001 and Lauri et al, 2001). Interestingly, an involvement 
of the glutamate neurotransmitter system in the 
pathophysiology of schizophrenia has been postulated. The 
"Glutamate Hypothesis" attempts to explain the psychotic 
symptoms that arise following administration of ionotropic 
glutamate receptor antagonists such as phencyclidine (PCP; 
"Angel Dust") and ketamine (Goff and Wine, 1997). Several 
studies also point to changes, predominantly decreases, in 
glutamate receptor subunit expression (including kainate 
receptors) in the brains of schizophrenic patients (Ibrahim 
et al, and Meador-Woodruff, 2001). Similarly, Mohn et al, 
1999 report that mutant mice with reduced NMDAR1 (another 
glutamate receptor) expression levels display 
schizophrenia-like behaviours. 

As well as aberrant neurotransmission function in the 
adult, it has been suggested that neurodevelopmental 
deficits may contribute to schizophrenia. Neuroanatomical 
studies indicate statistically significant reduced volumes 
of brain regions, primarily the hippocampus, in 
schizophrenic and comorbid patients (Sanderson et al, 1999 
and Pearlson, 1999). GRIK4 is expressed in the amygdala, 
hippocampal formation (CA3 pyramidal and dentate granule 
cells) and entorhinal cortex. Glutamate receptors might 
mediate brain development through the activity-dependent 
refinement of neuronal connections. 

The present subject was clinically diagnosed as having 
schizophrenia coupled with mild learning disability. It 
may be the case that causative gene mutations in comorbid 
patients lead to a severe phenotype or have more profound 
downstream effects than gene mutations in patients with 
schizophrenia alone (i.e. the comorbid state represents the 
severest form of schizophrenia (Doody et al, 1998)). A 
second possibility is that the gene mutation gives rise to 
the learning disability component of the illness through an 
independent effect on brain development. The manner in 
which the mutated genotype gives rise to the observed 
phenotype (via functional or developmental mechanisms) is 
a key issue in molecular neurobiology, particularly in the 
characterisation of mouse "knockout" mutants (Mayford et 
al, 1995). 

A large number of publications detail family and 
population-based linkage studies carried out to identify 
psychiatric illness susceptibility loci. The results have 
not been conclusive, perhaps indicating the presence of 
confounding factors such as population stratification, 
incomplete penetrance, genetic heterogeneity and uncertain 
mode of inheritance. Nevertheless, GRIK4 lies at the edge 
of a schizophrenia linkage region described in a recent 
publication (Gurling et al, 2001). The most centromeric 
marker exhibiting linkage to schizophrenia in this paper, 
D11S925, is located within an intron at the 3' end of GRIK4. 

Example 4: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 3 

Fine FISH mapping of the breakpoint with cosmid clones 

PCR products corresponding to regions in or near 
hNPAS3 exons 4, 5 and 6 were obtained using the following 
primers under standard PCR conditions (Exon 4-i 
ACAACCATTCTGGGAACAGC, Exon 4-ii GTGTAGGGAAAGCCATCCAA, Exon 
5-i TCTTTTTCCTGCAGTCCCTG, Exon 5-ii CTCCAAATGACTCCTGCCAT, 
Exon 6-i GCCTCTGCCATAGATTTTGC, Exon 6-ii 
TTCCTTCCCACCCTTTCTCT). Probes were created by random- 
primed labelling of PCR products with radioactive dCTP; 
these were used to screen a LANL chromosome 14-specific 
cosmid library (LA14NC01 obtained from the UK HGMP Resource 
Centre, Hinxton, Cambridge) using hybridisation conditions 
set out in Church and Gilbert (1986). Positive clones (exon 
4: LA1431-G5, exon 5: LA14123-C4 and exon 6: LA1487-D9) 
were prepared by a standard alkaline lysis protocol and 
taken through FISH analysis as above. 

Results 

Metaphase spreads from EBV-transformed cell lines were 
analysed by Fluorescence in situ Hybridisation (FISH) using 
successively smaller DNA probes. A breakpoint-spanning BAC 
clone was obtained by FISH screening (RPCI-11 BAC 1078I14, 
acc. no. AL161851). EST sequences were examined in the 
genomic DNA flanking the breakpoint in order to identify 
potential transcripts in the locality. A number of ESTs 
were identified which had been annotated as containing 
homologous sequence to the conserved "PAS" domain present 
in a large number of genes (Gu et al, 2000). A search of 
such genes revealed that the most closely related gene 
encoded a mouse brain-expressed transcript, neuronal pas 
domain protein 3 (NPAS3 (MOP6), acc. no. AF137871; 
hereafter referred to as mNPAS3). Searching for nucleotide 
homology to the mNPAS3 cDNA within human genomic DNA BAC 
clone sequences at 14q13 using the BLAST algorithm 
identified 12 exons corresponding to the human orthologue of 
mNPAS3 (hNPAS3) distributed over a genomic region of 
approximately 800-900 kb, making it among the largest gene 
loci in the human genome (Figure 23). Subsequently, full 
length hNPAS3 cDNA sequences have been submitted by two 
other groups to GenBank/EMBL with the accession numbers 
AB054575 and AF164438, although these have differences to 
the mouse splice-form in the 5' exons. This is due to the 
presence of two alternative transcription start sites 
employed in both human and mouse genes. This was confirmed 
by analysis of published cDNA and EST sequences coupled with 
further sequencing of corresponding IMAGE clones. These 
splice variants are highlighted in Figures 18, 30 and 23. 

The ratio of fluorescent signals on the derived 
chromosomes 9 and 14 from the breakpoint-spanning BAC 
probe, 1078I14, indicated that the breakpoint was located 
at the centromeric end of the BAC. This is the location of 
exon 5 of the gene. Exon 4-, 5- and 6-containing cosmids 
were isolated and used as FISH probes to provide definitive 
proof of the location of the breakpoint and confirmation 
that a full-length transcript (and hence protein) cannot be 
synthesized on the derived chromosome 14. An exon 5- 
containing cosmid (see Figure 23) spanned the breakpoint. 
Subsequently a long-range PCR product-derived FISH probe 
corresponding to exon 5 indicated that the breakpoint lay 
upstream of exon 5. 

Long-range PCR primers - NPAS3 exon 5 

a) ccagcttgtatgtggtgtgg 

b) ttactcccagtgcccattgt. 

Discussion 

A FISH-based approach has shown that the gene, NPAS3, 
is disrupted by a chromosomal rearrangement present in a 
mother and daughter who suffer from comorbid schizophrenia 
and learning disability respectively. NPAS3 is a brain- 
expressed transcription factor of the basic helix-loop- 
helix PAS domain class which includes members such as AHR 
and ARNT. 

Neuronal pas3 (NPAS3) was originally cloned in the 
mouse (Brunskill et al, 1999) on the basis of its sequence 
homology with other PAS domain proteins. Its expression 
has been characterised in the developing mouse embryo where 
high levels are seen in the neural tube, neuroepithelium 
and, later, the neopallial layer of the cortex. Non-neural 
expression was also observed in the heart, limb and kidney. 
In the mouse, NPAS1 (human chromosomal location, 19q13) is 
expressed in deep pyramidal cortex cells, hippocampus and 
amygdala (Zhou et al., 1997). NPAS2 (human chromosomal 
location, 2q13) is expressed in the cortex, hippocampus and 
thalamus. Lower levels were also seen in spinal cord, 
intestines and uterus. NPAS2 was also recently deleted in 
mice by homologous recombination (Garcia et al., 2000) 
leading to deficits in cued and contextual memory. In 
addition NPAS2 appears to have a role in cellular energy 
state monitoring and the circadian rhythm pathway (Reick et 
al, 2001 and Rutter et al, 2001). The translocation event 
described herein disrupts the gene between exons 4 and 5. 
If transcription occurred at this disrupted locus, a 
truncated protein would result containing only the bHLH 
domain. It is conceivable that this protein would have a 
dominant negative effect on wild-type NPAS3 protein (or any 
other heterodimeric protein partner) through the creation 
of non-functional dimers (see Figure 24 for explanatory 
diagram). This would result in a potentially more severe 
or penetrant phenotype than a conventional point mutation. 
Two examples where bHLH-PAS proteins have been altered 
through loss of the C-terminal PAS domain (one 
experimentally, the other in a patient with a chromosome 
translocation) have resulted in probable dominant negative 
action (Maemura et al, 1999, Holder jr. et al, 2000). 

Mutations in this gene in karyotypically normal 
individuals would not be expected to have as severe or 
penetrant effects as those observed in the two t(9;14) 
patients. 

Sequence comparison between hNPAS3 and other members 
of the NPAS sub-family shows that homologies are largely 
restricted to the N-terminal end of the protein, the 
location of the basic helix-loop-helix and PAS domains. The 
greatest homology is with NPAS1, then NPAS2 and other PAS 
domain-containing proteins (data not shown). An alignment 
of the cognate human (conceptually translated from the 
splice-form containing exons 1-12) and mouse NPAS3 proteins 
reveals near identity over the N-terminal half of the 
protein but increased divergence at the C-terminal end. 
This is particularly the case for two stretches where 5 and 
7 amino acids, respectively, have been gained in the human 
orthologue (Fig. 21). These correspond to two poly-glycine 
tracts present within exon 12 (of 11 and 10 residues 
respectively). Such tracts can be indicative of slipped 
strand mispairing whereby trinucleotide repeats are 
aberrantly expanded or deleted. Where they occur in coding 
sequence, increases in the number of trinucleotide repeats 
can have a pathological effect on protein function (e.g. 
Huntington disease and Spino-cerebellar ataxia 1). Another 
feature of such repeats is their unstable nature between 
generations: a lowering of the age of onset of a disease 
from generation to generation (anticipation) can often be 
directly linked to an increase in the number of repeat 
units. 
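
Homopolymeric tracts such as the poly-glycine runs 
discussed above can be located in a protein sequence with a 
simple pattern search. The sketch below is illustrative 
only: the function name, the minimum tract length and the 
toy sequence are assumptions, and the toy sequence is not 
the NPAS3 sequence. 

```python
import re


def homopolymer_tracts(protein, residue="G", min_len=5):
    """Return (1-based start position, length) for each run of `residue`
    that is at least `min_len` residues long."""
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start() + 1, len(m.group())) for m in pattern.finditer(protein)]


# HYPOTHETICAL toy peptide containing two glycine runs (not real NPAS3 sequence)
toy_peptide = "MAP" + "G" * 11 + "SHA" + "G" * 10 + "PLK"

for start, length in homopolymer_tracts(toy_peptide):
    print(f"poly-G tract at residue {start}, length {length}")
```

Comparing the tract lengths returned for orthologous 
sequences would show where residues have been gained or 
lost between species. 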

Exon 12 (coding for the C-terminus of the protein) is 
also noteworthy because of the extremely high density of 
CpG dinucleotides (in humans and mouse), a feature that 
abruptly ends at the junctions with flanking intronic/3' 
sequences. This "CpG island" is unusual because it is both 
transcribed and also located at the 3' rather than 5' end 
of the gene. The significance of this in terms of 
potential transcriptional control by methylation or 
susceptibility to mutation is as yet unknown. However, the 
high level of G and C bases creates a bias in amino acid 
composition such that alanine, glycine, histidine and 
proline are over-represented. This may explain the 
presence and expansion of the poly-glycine tracts in Npas3. 
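
The CpG density of a region such as the one described above 
is conventionally summarised by its GC content and by the 
ratio of observed to expected CpG dinucleotides, where the 
expected number is (number of C x number of G) / sequence 
length. The following minimal sketch computes both values; 
the function name and the toy sequences are hypothetical, 
and the thresholds commonly quoted for calling a CpG island 
(GC content above 50% and an observed/expected ratio above 
0.6 over at least 200 bp) are given as general background 
rather than as criteria used in this specification. 

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence.

    observed/expected = (#CpG * length) / (#C * #G)
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


# HYPOTHETICAL toy sequences, for illustration only
cpg_rich = "GCGCGGCGCCGCGGGCGCGC"
cpg_poor = "ATTGCATTACAGTTTAGCAT"

for label, seq in (("CpG-rich", cpg_rich), ("CpG-poor", cpg_poor)):
    gc, ratio = cpg_stats(seq)
    print(f"{label}: GC content {gc:.2f}, CpG obs/exp {ratio:.2f}")
```

In practice the statistic is computed in a sliding window 
along the genomic sequence so that the abrupt boundaries of 
the island can be located. 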

14q13 is also the site of linkage to Fahr's syndrome 
(idiopathic basal ganglia calcification; IBGC) as 
determined from analysis of families (Geschwind et al, 
1999). Fahr's syndrome symptoms are often accompanied by 
psychoses such as schizophrenia. Thus, it may be the case 
that NPAS3 is also the gene responsible for Fahr's 
syndrome. 

Example 5: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 4 

Psychiatric evaluation 

The subject (male) is the proband in a family 
segregating a t(1;16) balanced reciprocal translocation. 
He gave full informed consent to the study. His diagnosis 
of chronic schizophrenia was confirmed by SADS-L structured 
interview and a consensus reached by two psychiatrists (WM 
and DB). He does not have mental retardation. Other 
members of his near family also gave consent to participate 
in this study, none of whom had current mental illness 
(several are below the age of risk for psychiatric 
illness). There was also a history of mental illness 
(major depressive disorder) in members of the extended 
family who were known to be translocation carriers, but 
they could not be approached for confirmation at the time 
of the current study. An unrelated individual (now 
deceased) with DSM-IV chronic schizophrenia without 
learning disability also had a t(1;16) balanced 
translocation with the same breakpoints (at the resolution 
of G-banding). 

PCR primers 

Long-range PCR for FISH probe templates: 
PDE4B3a; GTCAGACAAATCCAAATGGAGAG, 
PDE4B3b; CTTTCTCCTGTCACTTTCCTTCA. 

Cycling conditions: 94°C - 2 min initial denaturation. 94°C - 
1 min, 52°C - 1 min, 72°C - 75 s: 33 cycles. 72°C - 15 min final 
extension. 
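
For planning purposes, the total hold time of a cycling 
programme of this form is the initial denaturation plus the 
number of cycles multiplied by the sum of the three step 
times, plus the final extension. The short sketch below 
performs this arithmetic for the conditions given above; 
the function and variable names are hypothetical, and 
instrument ramp times between temperatures are not 
included. 

```python
def programme_seconds(initial_s, step_times_s, cycles, final_s):
    """Total hold time of a PCR programme, excluding ramp times."""
    return initial_s + cycles * sum(step_times_s) + final_s


INITIAL_DENATURATION_S = 2 * 60   # 94 deg C, 2 min
CYCLE_STEPS_S = (60, 60, 75)      # 94 deg C 1 min, 52 deg C 1 min, 72 deg C 75 s
CYCLES = 33
FINAL_EXTENSION_S = 15 * 60       # 72 deg C, 15 min

total_s = programme_seconds(INITIAL_DENATURATION_S, CYCLE_STEPS_S,
                            CYCLES, FINAL_EXTENSION_S)
print(f"total hold time: {total_s} s ({total_s / 60:.2f} min)")
```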

The balanced translocation, t(1;16)(p31.2;q21), in 
this family results in two breakpoints (Figure 33). 
Genomic sequence at 16q21 is not complete. The only known 
gene in the vicinity of the breakpoint region is Cadherin 
8 (CDH8, acc. AB035305). 

In contrast, on chromosome 1p31.2 FISH identified two 
non-overlapping BAC clones (RP11-433N2, acc. AL513493 and 
RP11-442I1, acc. AL391359) which reside on either side of 
the breakpoint in this patient. The breakpoint-containing 
genomic region between these two BAC clones has yet to be 
sequenced (see Figure 32). Database annotation of the two 
BAC clones together with BLAST mapping of exons onto 
genomic sequence indicated that this locus contains a cAMP 
phosphodiesterase gene, PDE4B. Two cDNAs corresponding to 
longer transcript forms of this gene (denoted PDE4B1, acc. 
L20966 and PDE4B3, acc. U85048, respectively) have been 
previously characterised (Bolger et al, 1994; Huston et al, 
1997). Long-range PCR product FISH (Figure 32) confirmed 
that the PDE4B1 transcript is directly disrupted by the 
breakpoint (although additional position-effect 
perturbation of PDE4B3 expression cannot be ruled out). 
Huston et al. (1997) have previously shown that the PDE4B1 
transcript encodes an alternative N-terminal peptide 
sequence. In addition, they demonstrated that only this 
form is expressed in the brain. It is therefore predicted 
that this patient will have a reduction in the levels of 
functional PDE4B in the brain. 

Discussion 

The present inventors have identified a subject with 
DSM-IV chronic schizophrenia in whom chromosome 
translocation events have disrupted brain-expressed genes 
that are also functional disease candidates. Without 
wishing to be bound by theory it is hypothesised that the 
disruption of the PDE4B gene by a chromosomal breakpoint 
(and the resulting reduced gene dosage) is the principal 
underlying cause of psychiatric disease in this patient. 

The gene disrupted in this patient is both expressed 
in the brain and participates in key physiological 
processes in the CNS. Notably, the gene may be involved in 
the alteration of the strength of synaptic/neural 
transmission, a phenomenon known as long-term potentiation 
(LTP). LTP is postulated to underlie cognitive functions 
such as learning and memory. Moreover, cognitive testing 
has previously established that these functions are 
frequently affected in patients with schizophrenia. 

PDE4B 

Stimulation of the G protein coupled 
receptor/heterotrimeric G protein pathway results in the 
synthesis of the secondary messenger, cAMP, by members of 
the adenylyl cyclase family of enzymes. This secondary 
messenger triggers a well-characterised signalling cascade 
that is principally mediated by cAMP-dependent protein 
kinase A (PKA) and the cAMP-responsive transcription factor, 
CREB, both of which have been implicated in the molecular 
pathways of LTP (Abel & Lattal, 2001). cAMP signalling is 
attenuated by its breakdown by members of the 
phosphodiesterase enzyme family. Four members of the PDE4 
sub-family of cAMP phosphodiesterases have been identified 
to date (PDE4A-PDE4D). These four genes are the human 
homologues of the Drosophila learning and memory mutant 
gene, Dunce. The long form of the PDE4B protein, PDE4B1, 
is the only splice form with brain expression and the 
present inventors have shown that it is disrupted in the 
subject. Anti-PDE4B antibodies revealed expression within 
the inferior olive, the hypothalamus, the ventral striatum, 
the cerebellar molecular layer, globus pallidus, nucleus 
accumbens and substantia nigra (Cherry & Davis, 1999). The 
authors of this expression study suggested that PDE4B 
expression strongly correlates with brain areas underlying 
reward and affect in mammals. In addition, PDE4 proteins 
are recognised as the molecular targets for Rolipram, a 
drug with anti-depressant effects. Rolipram inhibition of 
PDE4 activity has been shown to improve long-term 
hippocampal LTP and spatial memory in mice (Barad et al, 
1998 and Bach et al, 1991). The (heterozygous) disruption 
to PDE4B1 described here may be equivalent to a 50% 
reduction of protein product in the brain. This could result 
in a greater cAMP half-life and a concomitant increase in 
the activation of downstream cAMP targets. 
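
The relationship between enzyme dosage and cAMP half-life 
suggested above can be illustrated with a deliberately 
simplified model. Assuming, purely for illustration, that 
cAMP is broken down by first-order kinetics and that a 
fraction f of the total breakdown rate is attributable to 
the disrupted enzyme, the half-life (ln 2 / k) changes by a 
factor of 1 / ((1 - f) + f x r), where r is the fraction of 
that enzyme remaining. The first-order assumption, the 
values of f and the function name below are all 
hypothetical and are not measurements from this study. 

```python
def half_life_fold_change(pde4b_fraction, remaining=0.5):
    """Fold change in cAMP half-life under first-order breakdown.

    pde4b_fraction: HYPOTHETICAL share of the total breakdown rate
                    contributed by the disrupted enzyme (0 to 1)
    remaining:      fraction of that enzyme still present (0.5 models
                    loss of one of two alleles)
    """
    # Relative rate constant after disruption; half-life = ln(2) / k,
    # so the fold change in half-life is the inverse of the relative rate.
    relative_rate = (1.0 - pde4b_fraction) + pde4b_fraction * remaining
    return 1.0 / relative_rate


for f in (0.25, 0.5, 1.0):
    print(f"enzyme share {f:.2f}: half-life x{half_life_fold_change(f):.2f}")
```

Under these assumptions the half-life can at most double 
when one allele is lost, and the effect is smaller where 
other phosphodiesterases contribute to cAMP breakdown. 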

In addition, the disruption to PDE4B shows reduced 
penetrance as not all translocation carriers present with 
psychiatric illness (although all members of the extended 
family with psychiatric illness possess the translocation 
karyotype; data not shown). 

Example 6: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 4 

FISH experiments on chromosome 16q21 had narrowed the 
location of the breakpoint to a region including the large 
gene CDH8 (approximately 400 kb genomic extent). Three BAC 
clones were selected from the tiling diagram of BAC clones 
placed on the human genome map backbone (June 2002 release 
of the 'BAC End Pairs' track on the UCSC Genome Browser; 
http://genome.cse.ucsc.edu/index.html?org=Human). These 
were RPCI-11 599c11, RPCI-11 875e12 and RPCI-11 685m21. By 
FISH, these BAC clones flanked the breakpoint (the first 
two translocated to the derived chromosome 1 whereas the 
third remained on the derived chromosome 16). The position 
of these three BAC clones indicated that the breakpoint lay 
within the large (100 kb) intron between exons 1 and 2 of 
the CDH8 gene (see Fig. 2). Thus, the inventors inferred 
from these results that the CDH8 gene was directly 
disrupted by the 16q21 translocation event and, as such, is 
a candidate gene for the psychiatric disorder exhibited by 
the patient. The similar disruption of the PDE4B gene on 
chromosome 1 and the relative orientations of the two genes 
on the two chromosomes raised the possibility that the 
derived chromosomes (the two chromosomes resulting from the 
translocation: der(1) and der(16)) could transcribe 
fusion/hybrid genes. This has been frequently seen in cases 
where a translocation gives rise to susceptibility to 
cancers. In essence, the translocation in the proband 
resulted in an exchange of the two genes' promoter and 
first exon sequences. On the der(1) the promoter and first 
exon of the CDH8 gene are juxtaposed to exon 2 and 
downstream sequence of the PDE4B gene (see Fig. 33). 
However, the reading frames of these two gene segments are 
not the same, resulting in a prematurely truncated peptide 
with only the signal peptide, proprotein fragment and a 
small portion of the cadherin domain contained within (see 
Fig. 37a). This would be expressed in the same cell 
types/tissues as the normal CDH8 gene but the 
functional/pathological significance of this small peptide 
is not clear at the current time. On the der(16) the PDE4B 
promoter and exon 1a are juxtaposed to exon 2 and downstream 
sequence of the CDH8 gene (see Fig. 33). Exon 1a of PDE4B 
does not contain a translation start-site so the reading 
frame compatibilities of the putative fused transcript are 
not an issue. However, exon 2 and downstream sequence of the 
CDH8 gene contain several ATG start-sites which could be 
employed by translational machinery to generate peptide 
sequences. In two of the reading frames, any generated 
peptides would be small and probably of no consequence. The 
third reading frame (the normal CDH8 reading frame, see 
Fig. 5b) contains three ATG start-sites early on, with the 
second of these forming a very good match to the canonical 
Kozak sequence found at most translation start-sites 
(CCAxxATGG). If this one is used then the resulting peptide 
will be identical to normal CDH8 protein but lacking the 
N-terminal portion encoding the signal peptide, proprotein 
fragment, the first cadherin domain and most of the second 
cadherin domain. Although the bulk of the peptide sequence 
is as the normal CDH8 protein, the lack of the N-terminal 
sequences may prevent the protein from entering the 
Golgi/ER subcellular compartments - a process that is 
required for the correct insertion in/trafficking to the 
cell membrane. The functional/pathological consequence of 
the presence of this truncated form of the CDH8 protein in 
the cytoplasm of tissues where the long form of the PDE4B 
gene is expressed is uncertain at this point. 
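
The search for candidate translation start-sites described 
above can be sketched as a scan of one reading frame for 
ATG codons, scoring each against the fixed positions of the 
quoted Kozak pattern CCAxxATGG. The following is a minimal 
illustration; the function name, the scoring scheme (one 
point per matching fixed base) and the toy sequence are 
assumptions, and the toy sequence is not CDH8 sequence. 

```python
def kozak_scores(seq, frame=0):
    """Return (0-based position, score) for each ATG in the given frame.

    The score counts matches at the four fixed non-ATG positions of
    CCAxxATGG: C at -5, C at -4, A at -3 and G at +3 relative to the A
    of the ATG (score 0 to 4).
    """
    seq = seq.upper()
    fixed_positions = ((-5, "C"), (-4, "C"), (-3, "A"), (3, "G"))
    hits = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] != "ATG":
            continue
        score = 0
        for offset, base in fixed_positions:
            j = i + offset
            if 0 <= j < len(seq) and seq[j] == base:
                score += 1
        hits.append((i, score))
    return hits


# HYPOTHETICAL toy transcript: first ATG in a strong context, second in a weak one
toy_transcript = "GCCACCATGGCATTTTTTATGTTT"

for position, score in kozak_scores(toy_transcript):
    print(f"ATG at position {position}: {score}/4 Kozak positions matched")
```

Running the same scan with frame set to 1 and 2 would cover 
the other two reading frames of the putative fused 
transcript. 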

In summary, the psychiatric illness seen in the 
proband, and other members of the family, may be the result 
of one (or a combination) of the following circumstances: 
the loss (through disruption) of one allele of PDE4B, the 
loss (through disruption) of one allele of CDH8 or the 
generation of potentially pathological fusion polypeptides. 

Cadherin-8 was first cloned in humans (Tanihara et 
al., 1994) and later in mouse (Munro et al., 1996) and rat 
(Kido et al., 1998). Sequence analysis immediately placed 
the gene product within the large family of membrane- 
spanning proteins with extracellular cadherin domains 
thought to mediate calcium-dependent homophilic 
interactions between adjacent cells. As such, the cadherins 
are members of the functionally defined group of cell 
adhesion proteins. 

CDH8 is a member of the Type II, or atypical, 
cadherins which are defined by the lack of an extracellular 
tripeptide motif, HAV, possibly involved in the binding 
specificity of Type I cadherins. Fig. 2 illustrates the 
structure of CDH8 protein which includes an extracellular 
domain containing 5 copies of the cadherin domain, a 
membrane spanning domain and a C-terminal cytoplasmic tail. 
The cytoplasmic tail is thought to signal the presence of 
interactions to the intracellular compartment by mediating 
receptor clustering through interaction with proteins 
such as β-catenin, α-catenin and, eventually, the 
cytoskeletal proteins, actin and α-actinin. In this way, 
adhesion to adjacent cells can affect the cytoarchitecture 
of the cell and may even play a role in cell motility. 

The two principal roles of neuronal cadherins are 
thought to be in the mediation of certain developmental 
pathways in the brain and the regulation of synaptic 
function. The homophilic nature of cadherin interaction 
(i.e. CDH8 proteins preferentially bind to other CDH8 
proteins) has prompted the hypothesis that cadherins are 
responsible for the aggregation or interconnection of 
similar cells within an organ. This has been shown to be 
the case in the brain where CDH8 expression has been shown 
to be restricted to particular subregions and even neuronal 
patches (Redies, Bishop, Rubenstein, Korematsu X 2). 

The major cadherin in the brain, N-cadherin (encoded 
by CDH2), has been implicated in synaptic long-term 
potentiation (LTP), the mechanism thought to underlie 
learning and memory in the brain (e.g. Huntley et al., 
2002 & Bozdagi et al., 2000). Other cadherins may also 
play a part in this process (Uemura, 1998 & Tang et al., 
1998). In essence, cadherins seem to form physical bridges 
across the synaptic cleft which may modify synaptic 
efficacy and/or spine morphology (two features of neurons 
demonstrated to change after the induction of LTP). 

Interestingly, two of the hypotheses used to explain 
the origins of psychiatric illness are, firstly, the 
occurrence of abnormal brain development and, secondly, the 
existence of deficits in cellular pathways manifested as 
poor performance in certain cognitive/memory tasks. The two 
roles of neuronal cadherins seem to closely mirror these 
two hypotheses, suggesting that CDH8 is a good functional 
candidate for psychiatric illness. 